Antibody categorization based on binding characteristics

ABSTRACT

Methods for categorizing antibodies based on their epitope binding characteristics are described. Methods and systems for determining the epitope recognition properties of different antibodies are provided. Also provided are data analysis processes for clustering antibodies on the basis of their epitope recognition properties and for identifying antibodies having distinct epitope binding characteristics.

RELATED APPLICATIONS

This application is a continuation of U.S. Nonprovisional patent application Ser. No. 13/480,356, filed May 24, 2012 and now U.S. Pat. No. 8,568,992, which is a continuation of U.S. Nonprovisional patent application Ser. No. 12/823,104, filed Jun. 24, 2010 and now U.S. Pat. No. 8,206,936, which is a continuation of U.S. Nonprovisional patent application Ser. No. 10/309,419, filed Dec. 2, 2002, now U.S. Pat. No. 7,771,951, which claims priority to U.S. Provisional Patent Application Ser. No. 60/337,245, filed Dec. 3, 2001, and U.S. Provisional Patent Application Ser. No. 60/419,387, filed Oct. 16, 2002, all of which are hereby incorporated by reference herein.

FIELD OF THE INVENTION

The present invention relates to grouping antibodies based on the epitopes they recognize and identifying antibodies having distinct binding characteristics. In particular, the present invention relates to antibody competition assay methods for determining antibodies that bind to an epitope, and data analysis processes for dividing antigen-specific antibodies into clusters or “bins” representing distinct binding specificities. Specifically, the invention relates to the Multiplexed Competitive Antibody Binning (MCAB) high-throughput antibody competition assay and the Competitive Pattern Recognition (CPR) data analysis process for analyzing data generated by high-throughput assays.

BACKGROUND OF THE INVENTION

Monoclonal antibodies (mAbs) have important therapeutic utility in the treatment of a wide variety of diseases such as infectious diseases, cardiovascular disease, inflammation, and cancer. (Storch (1998) Pediatrics 102:648-651; Coller et al. (1995) Thromb. Haemostasis 74:302-308; Present et al. (1999) New Eng. J. Med. 6:1398-1405; Goldenberg (1999) Clin. Ther. 21:309-318). Cells produce antibodies in response to infection or immunization with a foreign substance or antigen. The potential therapeutic utility of monoclonal antibodies is in part due to their specific and high-affinity binding to a target. Antibodies bind specifically to a target antigen by recognizing a particular site, or epitope, on the antigen. With the use of the recently developed XenoMouse® technology (Abgenix, Inc., Fremont, Calif.) together with established procedures for hybridoma cells or B cells (Kohler and Milstein (1975) Nature 256:495-497) and isolating lymphocytes (Babcook et al. (1996) Proc. Natl. Acad. Sci. 93:7843-7848), it is possible to generate large numbers of antigen-specific human monoclonal antibodies against almost any human antigen. (Green (1999) Jnl. Immunol. Methods 231:11-23; Jakobovits et al. (1993) Proc. Natl. Acad. Sci. U.S.A., 90:2551-2555; Mendez et al. (1997) Nat. Genet. 15:146-156; Green and Jakobovits, J. Exp. Med. (1998), 188:483-495).

The large numbers of antibodies generated against a particular target antigen may vary substantially in terms of both how strongly they bind to the antigen as well as the particular epitope they bind to on the target antigen. Different antibodies generated against an antigen recognize different epitopes and have varying binding affinities to each epitope. In order to identify therapeutically useful antibodies from the large number of generated candidate antibodies, it is necessary to screen large numbers of antibodies for their binding affinities and epitope recognition properties. For this reason, it would be advantageous to have a rapid method of screening antibodies generated against a particular target antigen to identify those antibodies that are most likely to have a therapeutic effect. In addition, it would be advantageous to provide a mechanism for categorizing the generated antibodies according to their target epitope binding sites.

SUMMARY OF THE INVENTION

The present disclosure provides methods for categorizing antibodies based on their epitope binding characteristics. One aspect provides methods and systems for determining the epitope recognition properties of different antibodies. Another aspect provides data analysis processes for clustering antibodies on the basis of their epitope recognition properties and for identifying antibodies having distinct epitope binding characteristics. Antibody categorization or “binning” as disclosed and claimed herein encompasses assay methods and data analysis processes for determining the epitope binding characteristics of a pool of antigen-specific antibodies, clustering antibodies into “bins” representing distinct epitope binding characteristics, and identifying antibodies having desired binding characteristics.

The method for categorizing antibodies based on binding characteristics includes:

a) providing a set of antibodies that bind to an antigen, labelling each antibody in the set to form a labelled reference antibody set such that each labelled reference antibody is distinguishable from every other labelled reference antibody in the labelled reference antibody set, selecting a probe antibody from the set of antibodies that bind to the antigen, contacting the probe antibody with the labelled reference antibody set in the presence of the antigen, detecting probe antibody in a complex that includes a labelled reference antibody bound to antigen, the antigen, and probe antibody bound to antigen; and

b) providing input data representing the outcomes of at least one antibody competition assay using a set of antibodies that bind to an antigen as in step a), normalizing the input data to generate a normalized intensity matrix, computing at least one dissimilarity matrix comprising generating a threshold matrix from the normalized intensity matrix and computing a dissimilarity matrix from the threshold matrix, and clustering antibodies based on dissimilarity values in cells of the dissimilarity matrix, to determine epitope binding patterns of the set of antibodies that bind to an antigen. The input data can be generated by a high throughput assay, preferably by the Multiplexed Competitive Antibody Binning (MCAB) assay. Preferably, the Competitive Pattern Recognition (CPR) process is used for data analysis.

For the antibody competition assay method, the probe antibody may be labelled, and detected in a complex that includes a labelled reference antibody bound to antigen, antigen, and labelled probe antibody bound to antigen, which allows determination whether the labelled probe antibody competes with any reference antibody in the labelled reference antibody set, because competition indicates that the probe antibody binds to the same epitope as another antibody in the set of antibodies that bind to an antigen. The probe antibody can be labelled, for example, with an enzymatic label, or a colorimetric label, or a fluorescent label, or a radioactive label.

In a preferred embodiment of the antibody competition assay method, a detection antibody is used to detect bound probe antibody, where the detection antibody binds only to probe antibody and not to reference antibody. The detection antibody detects bound probe antibody in a complex that includes a labelled reference antibody bound to antigen, antigen, and labelled probe antibody bound to antigen. A labelled detection antibody is used to detect bound probe antibody, where the detection antibody can be labelled, for example, with an enzymatic label, or a colorimetric label, or a fluorescent label, or a radioactive label. Alternately, the detection antibody is detected using a detection means such as an antibody-binding protein.

In particular, the antibody competition assay method for determining antibodies that bind to an epitope on an antigen includes providing a set of antibodies that bind to an antigen, labelling each antibody in the set with a uniquely colored bead to form a labelled reference antibody set such that each labelled reference antibody is distinguishable from every other labelled reference antibody in the labelled reference antibody set, selecting a probe antibody from the set of antibodies that bind to the antigen, contacting the probe antibody with the labelled reference antibody set in the presence of the antigen, detecting bound probe antibody in a complex that includes a labelled reference antibody bound to antigen, antigen, and probe antibody bound to antigen, and determining whether the probe antibody competes with any reference antibody in the labelled reference antibody set, where competition indicates that the probe antibody binds to the same epitope as another antibody in the set of antibodies that bind to an antigen. Each uniquely colored bead may have a distinct emission spectrum. The probe antibody can be labelled, for example, with an enzymatic label, or a colorimetric label, or a fluorescent label, or a radioactive label. Alternately, a detection antibody is used to detect bound probe antibody, where the detection antibody may be labelled. The labelled detection antibody can be labelled, for example, with an enzymatic label, or a colorimetric label, or a fluorescent label, or a radioactive label.

Another aspect of the present invention provides a method for characterizing antibodies based on binding characteristics by providing input data representing the outcomes of at least one antibody competition assay using a set of antibodies that bind to an antigen, normalizing the input data to generate a normalized intensity matrix, computing at least one dissimilarity matrix comprising generating a threshold matrix from the normalized intensity matrix and computing a dissimilarity matrix from the threshold matrix, and clustering antibodies based on dissimilarity values in cells of the dissimilarity matrix, to determine epitope binding patterns of the set of antibodies that bind to an antigen.

The input data can be signal intensity values representing the outcomes of an antibody competition assay using a set of antibodies that bind to an antigen. The input data representing the outcomes of an antibody competition assay can be stored in matrix form. The input data stored in matrix form can be in a two-dimensional matrix or a multidimensional matrix, and may be stored in a plurality of matrices. The input data stored in matrix form can be signal intensity values representing the outcomes of an antibody competition assay. The antibody competition assay can be the Multiplexed Competitive Antibody Binning (MCAB) assay. The input data stored in matrix form can be at least one matrix wherein each cell of the matrix comprises the signal intensity value of an individual antibody competition assay.

Normalizing the input data to generate a normalized intensity matrix can include generating a background-normalized intensity matrix by subtracting a first matrix with signal intensity values from a first antibody competition assay in which antigen was not added (negative control) from a second matrix with signal intensity values from a second antibody competition assay in which antigen was added. A minimum threshold value for blocking buffer values is set, and any blocking buffer values below the threshold value are adjusted to the threshold value prior to said generating the normalized intensity matrix.
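By way of a non-limiting illustration, the background-normalization step described above might be sketched as follows. The function name, the placement of the blocking buffer value in the last position of each row, and the sample intensities are assumptions made for this sketch only; the default floor of 200 intensity units echoes the value given in the figure description for Example 2.

```python
def background_normalize(with_antigen, without_antigen,
                         buffer_col=-1, min_buffer=200.0):
    """Subtract the no-antigen control, then floor the blocking buffer values.

    Assumed layout: one row per well, blocking buffer bead in the last column.
    """
    # Cell-by-cell difference: signal with antigen minus signal without antigen.
    diff = [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(with_antigen, without_antigen)]
    # Raise any blocking buffer value below the minimum up to the minimum,
    # so that later division by that value stays well behaved.
    for row in diff:
        row[buffer_col] = max(row[buffer_col], min_buffer)
    return diff


# Hypothetical intensities: two wells, two antibody beads plus a buffer bead.
with_ag = [[150.0, 2400.0, 1800.0], [2600.0, 180.0, 90.0]]
without_ag = [[50.0, 100.0, 60.0], [80.0, 60.0, 40.0]]
background_normalized = background_normalize(with_ag, without_ag)
```

In this sketch only the blocking buffer entries are floored, as in the paragraph above; a variant that floors every cell would apply `max` across the whole row instead.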

If desired, the normalizing step includes generating an intensity-normalized matrix by dividing each value in a column of the background-normalized intensity matrix by the blocking buffer intensity value for the column. The normalizing step can further include normalizing relative to the baseline signal for probe antibodies by dividing each column of the intensity-normalized matrix by its corresponding diagonal value to generate a final intensity-normalized matrix. Prior to dividing each column by its corresponding diagonal value, each diagonal value is compared with a user-defined threshold value and any said diagonal value below the user-defined threshold value is adjusted to the threshold value.

If desired, the normalizing step includes generating an intensity-normalized matrix by dividing each value in a row of the background-normalized intensity matrix by the blocking buffer intensity value for the row. The normalizing step can further include normalizing relative to the baseline signal for probe antibodies by dividing each row of the intensity-normalized matrix by its corresponding diagonal value to generate a final intensity-normalized matrix. Prior to dividing each row by its corresponding diagonal value, each diagonal value is compared with a user-defined threshold value and any said diagonal value below the user-defined threshold value is adjusted to the threshold value.
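As a hedged sketch of the row-wise variant, the two divisions can be carried out one row at a time. The function name, the default diagonal floor, and the assumption that the blocking buffer value occupies the last position of each row are illustrative choices, not values taken from the disclosure; the column-wise variant would apply the same steps to the transposed matrix.

```python
def normalize_rows(matrix, buffer_col=-1, min_diagonal=0.05):
    """Divide each row by its blocking buffer value, then by its diagonal value.

    Assumed layout: row i holds antibody i against every antibody in the set,
    followed by the blocking buffer value in the last column.
    """
    normalized = []
    for i, row in enumerate(matrix):
        # Step 1: scale the row by its blocking buffer intensity.
        buffer_value = row[buffer_col]
        scaled = [value / buffer_value for value in row]
        # Step 2: floor the self-competition (diagonal) value, then divide by it.
        diagonal = max(scaled[i], min_diagonal)
        normalized.append([value / diagonal for value in scaled])
    return normalized


# Hypothetical background-normalized intensities for two antibodies.
example = [[250.0, 2000.0, 1000.0],
           [1500.0, 125.0, 500.0]]
final_matrix = normalize_rows(example)
```

After this step a value near 1 indicates a signal comparable to self-competition, while larger values indicate that the probe antibody still binds in the presence of the reference antibody.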

Generating the threshold matrix involves setting the normalized value in each cell of the normalized intensity matrix to a value of one (1) or zero (0), wherein normalized values less than or equal to a threshold value are set to a value of zero (0) and normalized values greater than the threshold value are set to a value of one (1).
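The thresholding rule can be illustrated with a short sketch. The function name and the sample values are hypothetical; the comparison follows the rule stated above, in which values equal to the threshold are set to zero.

```python
def threshold_matrix(normalized, threshold):
    """Return a matrix of ones and zeroes: 1 where value > threshold, else 0."""
    return [[1 if value > threshold else 0 for value in row]
            for row in normalized]


# Hypothetical normalized intensities.
normalized_example = [[1.0, 8.0, 4.0],
                      [12.0, 1.0, 4.0]]
pattern = threshold_matrix(normalized_example, 2.0)
```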

At least one dissimilarity matrix is computed from the threshold matrix of ones and zeroes by determining the number of positions in which each pair of rows differs. A plurality of dissimilarity matrices can be computed using a plurality of threshold values. The average of a plurality of dissimilarity matrices can be computed and used as input to the clustering step.
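A minimal sketch of the dissimilarity computation and of averaging over several threshold values is given below. The function names and sample data are assumptions for illustration. The sketch reports the fraction of differing positions, as in the description of FIG. 14; multiplying by the row length recovers the count of differing positions.

```python
def dissimilarity_matrix(binary):
    """Entry (i, j) is the fraction of positions at which rows i and j differ."""
    n = len(binary)
    width = len(binary[0])
    return [[sum(a != b for a, b in zip(binary[i], binary[j])) / width
             for j in range(n)]
            for i in range(n)]


def average_dissimilarity(normalized, thresholds):
    """Average the dissimilarity matrices obtained at each threshold value."""
    n = len(normalized)
    total = [[0.0] * n for _ in range(n)]
    for threshold in thresholds:
        # Threshold matrix: 1 where the normalized value exceeds the threshold.
        binary = [[1 if value > threshold else 0 for value in row]
                  for row in normalized]
        current = dissimilarity_matrix(binary)
        for i in range(n):
            for j in range(n):
                total[i][j] += current[i][j]
    return [[value / len(thresholds) for value in row] for row in total]


# Hypothetical normalized intensities: three antibodies, four positions each.
normalized_example = [[1.0, 1.2, 3.0, 4.0],
                      [1.1, 1.0, 1.8, 5.0],
                      [3.5, 2.2, 1.0, 1.0]]
averaged = average_dissimilarity(normalized_example, [1.5, 2.0, 2.5])
```

Averaging over a range of thresholds reduces the sensitivity of the result to any single cutoff choice, which is the concern illustrated by FIG. 4.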

Clustering antibodies based on said dissimilarity values in cells of said dissimilarity matrix can include hierarchical clustering. Hierarchical clustering includes generating a hierarchy of nested subsets of antibodies within a set of antibodies that bind to an antigen by determining the pair of antibodies in the set having the lowest dissimilarity value, then determining the pair of antibodies having the next lowest dissimilarity value, and iteratively repeating this determination for each pair of antibodies having the next lowest dissimilarity value until one pair of antibodies remains, such that a hierarchy of nested subsets is generated that indicates the similarity of competition patterns within the set of antibodies. Clusters are determined based on competition patterns.
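The iterative merging described above can be sketched as a simple agglomerative procedure. The disclosure does not name a linkage rule in this passage, so the average-linkage rule used here is an assumption, as are the function name, the antibody names, and the dissimilarity values. The optional cutoff of 0.25 mirrors the dissimilarity cutoff mentioned in the description of FIG. 7.

```python
def agglomerate(names, dissim, cutoff=None):
    """Merge the closest clusters repeatedly (average linkage, an assumption).

    Returns (merges, bins): the merge history as (distance, members) tuples,
    and the clusters remaining when merging stops.
    """
    clusters = {i: [i] for i in range(len(names))}
    merges = []
    while len(clusters) > 1:
        keys = sorted(clusters)
        best = None
        for x in range(len(keys)):
            for y in range(x + 1, len(keys)):
                a, b = clusters[keys[x]], clusters[keys[y]]
                # Average dissimilarity over all cross-cluster antibody pairs.
                dist = sum(dissim[i][j] for i in a for j in b) / (len(a) * len(b))
                if best is None or dist < best[0]:
                    best = (dist, keys[x], keys[y])
        dist, kx, ky = best
        # Stop once the closest remaining pair is more dissimilar than the cutoff.
        if cutoff is not None and dist > cutoff:
            break
        clusters[kx] = clusters[kx] + clusters.pop(ky)
        merges.append((dist, [names[i] for i in clusters[kx]]))
    bins = [[names[i] for i in members] for _, members in sorted(clusters.items())]
    return merges, bins


# Hypothetical antibodies and dissimilarity values.
ab_names = ["Ab A", "Ab B", "Ab C", "Ab D"]
ab_dissim = [[0.0, 0.1, 0.9, 0.8],
             [0.1, 0.0, 0.85, 0.95],
             [0.9, 0.85, 0.0, 0.2],
             [0.8, 0.95, 0.2, 0.0]]
history, all_in_one = agglomerate(ab_names, ab_dissim)
_, bins_at_cutoff = agglomerate(ab_names, ab_dissim, cutoff=0.25)
```

The merge history corresponds to the nested subsets of the hierarchy, and the bins returned at a given cutoff correspond to clusters read off a dendrogram at that dissimilarity level.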

Alternately, the data analysis process involves subtracting the data matrix for the experiment carried out without antigen from the data matrix for the experiment carried out with antigen. The value in each diagonal cell is then used as a background value for determining the binding affinity of the antibody in the corresponding column. Cells in the subtracted matrix having values significantly higher than the corresponding diagonal value are highlighted or otherwise noted.
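This alternate analysis can be sketched as a comparison of each cell with the diagonal value of its column. What counts as "significantly higher" is not specified in this passage; the factor of two used below is an assumption, as are the function name and the sample values, and the matrix is assumed to be square (any blocking buffer entries removed).

```python
def highlight_above_diagonal(subtracted, factor=2.0):
    """Flag cells whose value exceeds factor times their column's diagonal value."""
    n = len(subtracted)
    return [[subtracted[i][j] > factor * subtracted[j][j] for j in range(n)]
            for i in range(n)]


# Hypothetical subtracted intensities for three antibodies.
subtracted_example = [[100.0, 900.0, 120.0],
                      [800.0, 150.0, 700.0],
                      [90.0, 850.0, 110.0]]
flags = highlight_above_diagonal(subtracted_example)
```

In this example the first and third antibodies are never flagged against each other, which is the pattern expected of two antibodies that compete for the same epitope.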

Data from the clustering step can be captured, including by automated means. In particular, data from the clustering step can be captured in a format compatible with a data input device or computer. The clustering step can generate a display, which can be in a format compatible with a data input device or computer. The display generated by the clustering step can be a dissimilarity matrix. Clusters can be determined by visual inspection of the dissimilarity value in each cell of the dissimilarity matrix. The dissimilarity matrix can include cells having a visual indicator of the cluster to which the antibody pair represented by said cell belongs, where the visual indicator may be a color, or shading, or patterning. Alternately, the display can be a dendrogram defined by dissimilarity values computed for a set of antibodies. Such a dendrogram has branches representing antibodies in the set of antibodies, wherein the arrangement of branches represents relationships between antibodies within the set of antibodies, and the arrangement further represents clusters of antibodies within the set of antibodies. In such a dendrogram, the length of any branch represents the degree of similarity between the binding pattern of antibodies or cluster of antibodies represented by said branch.

Input data representing the outcomes of a plurality of antibody competition assays can be analyzed, wherein each assay represents an individual experiment using a set of antibodies that bind to an antigen, further wherein each experiment includes at least one antibody that is also tested in at least one other experiment. When a plurality of experiments is analyzed, an individual normalized intensity matrix can be generated for each individual experiment and a single normalized intensity matrix can be generated by computing the average intensity value of each antibody pair represented in each individual normalized intensity matrix. A single dissimilarity matrix representing each antibody pair tested in the plurality of antibody competition assays can be generated from the single normalized intensity matrix.
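One possible way to combine several experiments is sketched below, with each antibody pair averaged over the experiments in which it was tested. The function name, the dictionary keyed by antibody pair, and the sample values are assumptions for illustration. Pairs never tested together are simply absent from the result, which is a choice made for this sketch and not a statement of how the disclosed process treats them.

```python
from collections import defaultdict


def merge_experiments(experiments):
    """Average normalized intensities per antibody pair across experiments.

    Each experiment is (names, matrix), where matrix[i][j] is the normalized
    intensity for the pair (names[i], names[j]).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for names, matrix in experiments:
        for i, first in enumerate(names):
            for j, second in enumerate(names):
                sums[(first, second)] += matrix[i][j]
                counts[(first, second)] += 1
    return {pair: sums[pair] / counts[pair] for pair in sums}


# Two hypothetical experiments sharing antibodies "Ab A" and "Ab B".
experiment_1 = (["Ab A", "Ab B"], [[1.0, 4.0],
                                   [3.0, 1.0]])
experiment_2 = (["Ab A", "Ab B", "Ab C"], [[1.0, 2.0, 5.0],
                                           [5.0, 1.0, 2.0],
                                           [6.0, 4.0, 1.0]])
merged = merge_experiments([experiment_1, experiment_2])
```

The shared antibodies are what allow the individual experiments to be placed on a common footing before a single dissimilarity matrix is computed.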

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. Schematic illustration of one embodiment of an epitope binning assay using labelled bead technology in a single well of a microtiter plate. Each reference antibody is coupled to a bead with a distinct emission spectrum, forming a uniquely labelled reference antibody. The entire set of uniquely labelled reference antibodies is placed in the well of a multiwell microtiter plate and incubated with antigen. A probe antibody is added and the interaction of probe antibody with each uniquely labelled reference antibody is determined.

FIG. 2. Correlation between blocking buffer intensity values and average intensity. FIG. 2A. Correlation between blocking buffer intensity and average intensity within rows. Blocking buffer intensity value for each row (y-axis) plotted against the average intensity value of the row with blocking buffer value omitted (x-axis). Fitting a line to the data shows a strong linear correlation between the blocking buffer values and the average intensity values of the rest of the row. FIG. 2B. Correlation between blocking buffer intensity and average intensity within columns. Blocking buffer intensity value for each column (y-axis) plotted against the average intensity value of the column with blocking buffer value omitted (x-axis). Fitting a line to the data shows a relatively weak linear correlation between the blocking buffer values and the average intensity values of the rest of the column. FIG. 2C. Scatter plot of intensity values for the matrix with antigen and background-normalized matrix. This plot shows a tight linear correlation (slope about 1.0) for high subtracted signal values, indicating that the background signal is minimal relative to the signal in the presence of antigen. The points are shaded according to the value of the fraction, calculated as the subtracted signal divided by the signal for the experiment with antigen present. Smaller fraction values (closer to zero) correspond to high background contribution and have light shading. Larger fraction values (closer to 1) correspond to lower background contribution and have darker shading.

FIG. 3. Comparison of epitope binning results with FACS results. Results from antibody experiments using the ANTIGEN39 antibody are shown, comparing results using the epitope binning method described herein with results using flow cytometry (fluorescence-activated cell sorter, FACS). Antibodies are assigned to bins 1-15, as indicated by rows 1-15 in the far left column using the epitope binning assay. Hatching in cells indicates antibodies that are FACS positive for cells expressing ANTIGEN39 (cell line 786-0), and no hatching indicates antibodies that are negative for cells that do not express ANTIGEN39 (cell line M14).

FIG. 4. Dissimilarity vs. background value: effect of choice of threshold cutoff value. The figure shows the amount of dissimilarity between antibodies 2.1 and 2.25 calculated at various threshold values. The amount of dissimilarity represents the value for the dissimilarity matrix for the entry corresponding to the two antibodies, Ab 2.1 and Ab 2.25 for a series of dissimilarity matrices computed using different threshold values. Here, the x-axis is the threshold value, and the y-axis is the dissimilarity value calculated using that threshold cutoff value.

FIG. 5. Dendrogram for the ANTIGEN14 antibodies. The length of branches connecting two antibodies is proportional to the degree of similarity between the two antibodies. This figure shows that there are two very distinct epitopes recognized by these antibodies. One epitope is recognized by antibodies 2.73, 2.4, 2.16, 2.15, 2.69, 2.19, 2.45, 2.1, and 2.25. A different epitope is recognized by antibodies 2.13, 2.78, 2.24, 2.7, 2.76, 2.61, 2.12, 2.55, 2.31, 2.56, and 2.39. Antibody 2.42 does not have a pattern that is very similar to any other antibody, but has some noticeable similarity to the second cluster, although it may recognize yet a third epitope which partially overlaps with the second epitope.

FIG. 6. Dendrograms for ANTIGEN39 antibodies. FIG. 6A. Dendrogram for the ANTIGEN39 antibodies for five input experimental data sets. The number of unique clusters of antibodies suggests that there are several different epitopes, some of which may overlap. For example, the cluster containing antibodies 1.17, 1.55, 1.16, 1.11 and 1.12 and the cluster containing 1.21, 2.12, 2.38, 2.35, and 2.1 appear to be fairly closely related, with each antibody pair with the exception of 2.35 and 1.11 being no more than 25% different. This high degree of similarity across the two clusters suggests that the two different epitopes themselves have a high degree of similarity. FIG. 6B. Dendrogram for the ANTIGEN39 antibodies for Experiment 1. Antibodies 1.12, 1.63, 1.17, 1.55, and 2.12 consistently cluster together in this experiment as well as in other experiments, as do antibodies 1.46, 1.31, 2.17, and 1.29. FIG. 6C. Dendrogram for the ANTIGEN39 antibodies for Experiment 2. Antibodies 1.57 and 1.61 consistently cluster together in this experiment as well as in other experiments. FIG. 6D. Dendrogram for the ANTIGEN39 antibodies for Experiment 3. Antibodies 1.55, 1.12, 1.17, 2.12, 1.11 and 1.21 consistently cluster together in this experiment as well as in other experiments. FIG. 6E. Dendrogram for the ANTIGEN39 antibodies for Experiment 4. Antibodies 1.17, 1.16, 1.55, 1.11 and 1.12 consistently cluster together in this experiment as well as in other experiments, as do antibodies 1.31, 1.46, 1.65, and 1.29, as well as antibodies 1.57 and 1.61. FIG. 6F. Dendrogram for the ANTIGEN39 antibodies for Experiment 5. Antibodies 1.21, 1.12, 2.12, 2.38, 2.35, and 2.1 consistently cluster together in this experiment as well as in other experiments.

FIG. 7. Dendrograms for clustering IL-8 monoclonal antibodies. FIG. 7A. Dendrograms for a clustering of seven IL-8 monoclonal antibodies. The dendrogram on the left is generated by clustering columns, and the dendrogram on the right by clustering rows of a background-normalized signal intensity matrix. Both dendrograms indicate that there are two epitopes, using a dissimilarity cutoff of 0.25: one epitope is recognized by monoclonal antibodies HR26, a215, a203, a393, and a452; a second epitope is recognized by monoclonal antibodies K221 and a33. FIG. 7B. Dendrograms for IL-8 monoclonal antibodies from a combined clustering analysis merging five different experimental data sets. The dendrogram on the left was generated by clustering columns, whereas the dendrogram on the right was generated by clustering rows of the background-normalized signal intensity matrix. Both dendrograms indicate that there are two epitopes, using a dissimilarity cutoff of 0.25: one epitope is recognized by monoclonal antibodies a809, a928, HR26, a215, and D111; a second epitope is recognized by monoclonal antibodies a837, K221, a33, a142, a358, a203, a393, and a452. FIG. 7C. Dendrograms for a clustering of nine IL-8 monoclonal antibodies. The dendrogram on the left was generated by clustering columns, and the dendrogram on the right by clustering rows of the background-normalized signal intensity matrix. Both dendrograms indicate that there are two epitopes, using a dissimilarity cutoff of 0.25: one epitope is recognized by monoclonal antibodies HR26 and a215; a second epitope is recognized by monoclonal antibodies K221, a33, a142, a203, a358, a393, and a452.

FIG. 8. Intensity matrices generated in the embodiment disclosed in Example 2 using a set of antibodies against ANTIGEN14. FIGS. 8A and 8B are tables showing the intensity matrix for the experiment conducted with antigen. FIGS. 8C and 8D are tables showing the intensity matrix for the same experiment conducted without antigen (control). These matrices are used as input data matrices for subsequent steps in data analysis.

FIGS. 9A-9B. Difference matrix for antibodies against the ANTIGEN14 target. The difference matrix is generated by subtracting the matrix corresponding to values obtained from the experiment without antigen (see FIGS. 8C and 8D) from the matrix corresponding to values obtained from the experiment with antigen (see FIGS. 8A and 8B) disclosed in Example 2.

FIGS. 10A-10B. Adjusted difference matrix with minimum threshold value. For the intensity values of Example 2, the minimum reliable signal intensity value is set to 200 intensity units and values below the minimum threshold are set to the threshold of 200.

FIGS. 11A-11C. Row normalized matrix. Each row in the adjusted difference matrix of FIG. 10 is adjusted by dividing it by the last intensity value in the row, which corresponds to the intensity value for beads to which blocking buffer is added in place of primary antibody. This adjusts for well-to-well intensity.

FIGS. 12A-12C. Diagonal normalized matrix. All columns except the one corresponding to Antibody 2.42 were column-normalized. Dividing each column by its corresponding diagonal is carried out to measure each intensity relative to an intensity that is known to reflect competition—i.e., competition against self.

FIGS. 13A-13B. Antibody pattern recognition matrix. For data from the embodiment disclosed in Example 2, intensity values below the user-defined threshold were set to zero. The user-defined threshold was set to two (2) times the diagonal intensity values. Remaining values were set to one.

FIGS. 14A-14B. Dissimilarity matrix. For data from the embodiment disclosed in Example 2, a dissimilarity matrix is generated from the matrix of zeroes and ones shown in FIG. 13, by setting the entry in row i and column j to the fraction of the positions at which two rows, i and j, differ. FIG. 14 shows the number of positions, out of 22 total, at which the patterns for any two antibodies differed for the set of antibodies generated against the ANTIGEN14 target.

FIGS. 15A-15C. Average dissimilarity matrix. After separate dissimilarity matrices were generated from each of several threshold values ranging from 1.5 to 2.5 times the values of the diagonals, the average of these dissimilarity matrices was computed (FIG. 15) and used as input to the clustering process.

FIGS. 16A-16C. Permuted average dissimilarity matrix. For data from the embodiment disclosed in Example 2, clusters can be visualized in matrices. In FIG. 16, the rows and columns of the dissimilarity matrix were rearranged according to the order of the “leaves” or clades on the dendrogram shown in FIG. 5, and individual cells were visually coded according to the degree of dissimilarity.
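The rearrangement of rows and columns according to dendrogram leaf order can be sketched as follows. The function name, the antibody names, the leaf order, and the dissimilarity values are hypothetical and are not the data of FIG. 16.

```python
def permute_by_leaf_order(matrix, names, leaf_order):
    """Reorder rows and columns of a square matrix to follow the leaf order."""
    index = {name: i for i, name in enumerate(names)}
    order = [index[name] for name in leaf_order]
    return [[matrix[i][j] for j in order] for i in order]


# Hypothetical dissimilarity matrix in the original antibody order.
original_names = ["Ab A", "Ab B", "Ab C", "Ab D"]
original = [[0.0, 0.9, 0.1, 0.8],
            [0.9, 0.0, 0.85, 0.2],
            [0.1, 0.85, 0.0, 0.95],
            [0.8, 0.2, 0.95, 0.0]]
# Hypothetical leaf order read from a dendrogram.
permuted = permute_by_leaf_order(original, original_names,
                                 ["Ab A", "Ab C", "Ab B", "Ab D"])
```

After permutation, antibodies with similar patterns sit next to each other, so low dissimilarity values gather into blocks along the diagonal and clusters become visible by inspection.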

FIGS. 17A-17C. Permuted normalized intensity matrix. For data from the embodiment disclosed in Example 2, rows and columns of the normalized intensity matrix were rearranged according to the order of the leaves on the dendrogram shown in FIG. 5, and individual cells were visually coded according to their normalized intensity values.

FIGS. 18A-18J. Permuted average dissimilarity matrix for five ANTIGEN39 input data sets. Data from five experiments that were conducted using antibodies against the ANTIGEN39 target (see Example 3) produced five input data sets. Dissimilarity matrices were generated for each input data set, and an average dissimilarity matrix was generated, and rows and columns were arranged (permuted) according to arrangement of the corresponding dendrogram(s) shown in FIG. 6.

FIGS. 19A-19J. Permuted normalized intensity matrix for five ANTIGEN39 input data sets. Data from five experiments that were conducted using antibodies against the ANTIGEN39 target (see Example 3) produced five input data sets. A normalized intensity matrix was generated for the five input data sets and rows and columns were arranged (permuted) according to arrangement of the corresponding dendrogram(s) shown in FIG. 6.

FIGS. 20A-20B. Permuted average dissimilarity matrix for Experiment 1 using a set of antibodies against the ANTIGEN39 target. Data from the set of antibodies analyzed in Experiment 1 (Example 3) were analyzed. See dendrogram shown in FIG. 6B.

FIGS. 21A-21B. Permuted normalized intensity matrix for Experiment 1 using a set of antibodies against the ANTIGEN39 target. Data from the set of antibodies analyzed in Experiment 1 (Example 3) were analyzed. See dendrogram shown in FIG. 6B.

FIG. 22. Permuted average dissimilarity matrix for Experiment 2 using a set of antibodies against the ANTIGEN39 target. Data from the set of antibodies analyzed in Experiment 2 (Example 3) were analyzed. See dendrogram shown in FIG. 6C.

FIG. 23. Permuted normalized intensity matrix for Experiment 2 using a set of antibodies against the ANTIGEN39 target. Data from the set of antibodies analyzed in Experiment 2 (Example 3) were analyzed. See dendrogram shown in FIG. 6C.

FIGS. 24A-24B. Permuted average dissimilarity matrix for Experiment 3 using a set of antibodies against the ANTIGEN39 target. Data from the set of antibodies analyzed in Experiment 3 (Example 3) were analyzed. See dendrogram shown in FIG. 6D.

FIGS. 25A-25B. Permuted normalized intensity matrix for Experiment 3 using a set of antibodies against the ANTIGEN39 target. Data from the set of antibodies analyzed in Experiment 3 (Example 3) were analyzed. See dendrogram shown in FIG. 6D.

FIGS. 26A-26B. Permuted average dissimilarity matrix for Experiment 4 using a set of antibodies against the ANTIGEN39 target. Data from the set of antibodies analyzed in Experiment 4 (Example 3) were analyzed. See dendrogram shown in FIG. 6E.

FIGS. 27A-27B. Permuted normalized intensity matrix for Experiment 4 using a set of antibodies against the ANTIGEN39 target. Data from the set of antibodies analyzed in Experiment 4 (Example 3) were analyzed. See dendrogram shown in FIG. 6E.

FIGS. 28A-28C. Permuted average dissimilarity matrix for Experiment 5 using a set of antibodies against the ANTIGEN39 target. Data from the set of antibodies analyzed in Experiment 5 (Example 3) were analyzed. See dendrogram shown in FIG. 6F.

FIGS. 29A-29C. Permuted normalized intensity matrix for Experiment 5 using a set of antibodies against the ANTIGEN39 target. Data from the set of antibodies analyzed in Experiment 5 (Example 3) were analyzed. See dendrogram shown in FIG. 6F.

FIG. 30. Clusters identified in Experiments 1-5 using sets of antibodies against the ANTIGEN39 target. FIG. 30 summarizes the clusters identified for each of the five individual data sets and for the combined data set for all of the antibodies generated in all five experiments disclosed in Example 3.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

With increased fusion efficiency producing larger numbers of antigen-specific antibodies from each hybridoma-cell fusion experiment, a screening method for managing and prioritizing large numbers of antibodies becomes ever more important. When a set of monoclonal antibodies has been generated against a target antigen, different antibodies in the set will recognize different epitopes, and will also have variable binding affinities. Thus, to effectively screen large numbers of antibodies it is important to determine which epitope each antibody binds, and to determine binding affinity for each antibody.

Epitope binning, as described herein, is the process of grouping antibodies based on the epitopes they recognize. More particularly, epitope binning comprises methods and systems for discriminating the epitope recognition properties of different antibodies, combined with computational processes for clustering antibodies based on their epitope recognition properties and identifying antibodies having distinct binding specificities. Accordingly, embodiments include assays for determining the epitope binding properties of antibodies, and processes for analyzing data generated from such assays.

In general, the invention provides an assay to determine whether a test moiety (such as an antibody) binds to a test object (such as an antigen) in competition with other test moieties (such as other antibodies). A capture moiety is used to capture the test object and/or the test moiety in an addressable manner and a detection moiety is utilized to addressably detect binding between other test moieties and the test object. When a test moiety binds to the same or a similar location on the test object as the test moiety being assayed, no binding is detected, whereas when a test moiety binds to a different location on the test object than the test moiety being assayed, binding is detected. In each case, the binding or lack thereof is addressable, so the relative interactions of the test moieties with the test object can be readily ascertained and categorized.

One embodiment of the invention is a competition-based method of categorizing a set of antibodies that have been generated against an antigen. This method relies upon carrying out a series of assays wherein each antibody from the set is tested for competitive binding against all other antibodies from the set. Thus, each antibody will be used in two different modes: in at least one assay, each antibody will be used in “detect” mode as the “probe antibody” that is tested against all the other antibodies in the set; in other assays, the antibody will be used in “capture” mode as a “reference antibody” within the set of reference antibodies being assayed. Within the set of reference antibodies, each reference antibody will be uniquely labelled in a way that permits detection and identification of each reference antibody within a mixture of reference antibodies. The method relies on forming “sandwiches” or complexes involving reference antibodies, antigen, and probe antibody, and detecting the formation or lack of formation of these complexes. Because each reference antibody in the set is uniquely labelled, it is possible to addressably determine whether a complex has formed for each reference antibody present in the set of reference antibodies being assayed.

Antibody Assay Overview

The method begins by selecting an antibody from the set of antibodies against an antigen, where the selected antibody will serve as the “probe antibody” that is to be tested for competitive binding against all other antibodies of the set. A mixture containing all the antibodies will serve as a set of “reference antibodies” for the assay, where each reference antibody in the mixture is uniquely labelled. In an assay, the probe antibody is contacted with the set of reference antibodies, in the presence of the target antigen. Accordingly, a complex will form between the probe antibody and any other antibody in the set that does not compete for the same epitope on the target antigen. A complex will not form between the probe antibody and any other antibody in the set that competes for the same epitope on the target antigen. Formation of complexes is detected using a labelled detection antibody that binds the probe antibody. Because each reference antibody in the mixture is uniquely labelled, it is possible to determine for each reference antibody whether that reference antibody does or does not form a complex with the probe antibody. Thus, it can be determined which antibodies in the mixture compete with the probe antibody and bind to the same epitope as the probe antibody.

Each antibody is used as the probe antibody in at least one assay. By repeating this method of testing each individual antibody in the set against the entire set of antibodies, the competitive binding affinities can be generated for the entire set of antibodies against an antigen. From such affinity measurements, one can determine which antibodies in the set have similar binding characteristics to other antibodies in the set, thereby allowing the grouping or “binning” of each antibody on the basis of its epitope binding profile. A table of competitive binding affinity measurements is a suitable method for displaying assay results. A preferred embodiment of this method is the Multiplexed Competitive Antibody Binning (MCAB) assay for high-throughput screening of antibodies.

Because this embodiment relies on testing antibody competition, wherein a single antibody is tested against the entire set of antibodies generated against an antigen, one challenge to implementing this method relates to the mechanism used to uniquely identify and quantitatively measure complexes formed between the single antibody and any one of the other antibodies in the set. It is this quantitative measurement that provides an estimate of whether two antibodies are competing for the same epitope on the antigen.

As described below, embodiments of the invention relate to uniquely labelling each reference antibody in the set prior to creating a mixture of all antibodies. This unique label, as discussed below, is not limited to any particular mechanism. Rather, it is contemplated that any method that provides a way to identify each reference antibody within the mixture, allowing one to distinguish each reference antibody in the set from every other reference antibody in the set, would be suitable. For example, each reference antibody can be labelled colorimetrically so that the particular color of each antibody in the set is determinable. Alternatively, each reference antibody in the set might be labelled radioactively using differing radioactive isotopes. The reference antibody may be labelled by coupling, linking, or attaching the antibody to a labelled object such as a bead or other surface.

Once each reference antibody in the set has been uniquely labelled, a mixture is formed containing all the reference antibodies. Antigen is added to the mixture, and the probe antibody is added to the mixture. A detection label is necessary in order to detect complexes containing bound probe antibody. A detection label may be a labelled detection antibody or it may be another label that binds to the probe antibody. For example, when a set of human monoclonal antibodies is being tested, a mouse anti-human monoclonal antibody is suitable for use as a detection antibody. The detection label is chosen to be distinct from all other labels in the mixture that are used to label reference antibodies. For example, a labelled detection antibody might be labelled with a unique color, or radioactively labelled, or labelled by a particular fluorescent marker such as phycoerythrin (PE).

The design of an experiment must include selecting conditions such that the detection antibody will only bind to the probe antibody, and will not bind to the reference antibodies. In embodiments in which reference antibodies are coupled to beads or other materials through antibodies, the antibody that couples the reference antibody to the bead (the “capture antibody”) will be the same antibody as the detection antibody. In accordance with this embodiment of the invention, the detection antibody is specifically chosen or modified so that the detection antibody binds only to the probe antibody and does not bind to the reference antibody. By using the same antibody for both detection and capture, each will block the other from binding to its respective target. Accordingly, when the capture antibody is bound to the reference antibody, it will block the detection antibody from binding to the same epitope on the reference antibody and producing a false positive result. Antibodies suitable for use as detection antibodies include mouse anti-human IgG2, IgG3, and IgG4 antibodies available from Calbiochem (Catalog No. 411427), mouse anti-human IgKappa available from Southern Biotechnology Associates, Inc. (Catalog Nos. 9220-01 and 9220-08), and mouse anti-hIgG from PharMingen (Catalog Nos. 555784 and 555785).

Once the labelled detection antibody has been added to the mixture, the entire mixture can then be analyzed to detect complexes between labelled detection antibody, bound probe antibody, the antigen, and uniquely labelled reference antibody. The detection method must permit detection of complexes (or lack thereof) for each uniquely labelled reference antibody in the mixture.

Detecting whether a complex formed between a probe antibody and each reference antibody in the set indicates, for each reference antibody, whether that reference antibody competes with the probe antibody for binding to the same (or nearby) epitope. Because the mixture of reference antibodies will include the antibody being used as the probe antibody, it is expected that this provides a negative control. Detecting complex formation allows measurement of competitive affinities of the antibodies in the set being tested. This measurement of competitive affinities is then used to categorize each antibody in the set based on how strongly or weakly they bind to the same epitopes on the target antigen. This provides a rapid method for grouping antibodies in a set based on their binding characteristics.

In one embodiment, large numbers of antibodies can be simultaneously screened for their epitope recognition properties in a single experiment in accordance with embodiments of the present invention, as described below. Generally, the term “experiment” is used nonexclusively herein to indicate a collection of individual antibody assays and suitable controls. The term “assay” is used nonexclusively herein to refer to individual assays, for example reactions carried out in a single well of a microtiter plate using a single probe antibody, or may be used to refer to a collection of assays or to refer to a method of measuring antibody binding and competition as described herein.

In one embodiment, large numbers of antibodies are simultaneously screened for their epitope recognition properties using a sandwich assay involving a set of reference antibodies in which each reference antibody in the set is bound to a uniquely labelled “capture” antibody. The capture antibody can be, for example, a colorimetrically labelled antibody that has strong affinity for the antibodies in the set. As one example, the capture antibody can be a labelled mouse, goat, or bovine anti-human IgG or anti-human IgKappa antibody. Although embodiments described herein use a mouse monoclonal anti-human IgG antibody, other similar capture antibodies that will bind to the antibodies being studied are within the scope of the invention. Thus, one of skill in the art can select an appropriate capture antibody based on the origin of the set of antibodies being tested.

One embodiment of the present invention therefore provides a method of categorizing, for example, which epitopes on a target antigen are bound by fifty (50) different antibodies generated against that target antigen. Once the 50 antibodies have been determined to have some affinity for a target antigen, the methods described below are used to determine which antibodies in the group of 50 bind to the same epitope. These methods are performed by using each one of the 50 antibodies as a probe antibody to cross-compete against a mixture of all 50 antibodies (the reference antibodies), wherein the 50 uniquely labelled reference antibodies in the mixture are each labelled by a capture antibody. Those antibodies that recognize the same epitope will compete with one another, while antibodies that do not compete are assumed to not bind to the same epitope. By uniquely labelling a large number of antibodies in a single reaction, as described below, these methods allow for a pre-selected antibody to be competed against 10, 25, 50, 100, 200, 300, or more antibodies at one time. For this reason, the choice of testing 50 antibodies in an experiment is arbitrary, and should not be viewed as limiting on the invention.

Preferably, the Multiplexed Competitive Antibody Binning (MCAB) assay is used. More preferably, the MCAB assay is practiced utilizing the LUMINEX System (Luminex Corp., Austin, Tex.), wherein up to 100 antibodies can be binned simultaneously using the method illustrated in FIG. 1. The MCAB assay is based on the competitive binding of two antibodies to a single antigen molecule. The entire set of antibodies to be characterized is used twice in the MCAB sandwich assay, once in “capture” mode and once in “detect” mode.

In one embodiment, each capture antibody is uniquely labelled. Once a capture antibody has been uniquely labelled, it is exposed to one of the set of antibodies being tested, forming a reference antibody that is uniquely labelled. This is repeated for the remaining antibodies in the set so that each antibody becomes labelled with a different colored capture antibody. For example, when 50 antibodies are being tested, a labelled reference antibody mixture is created by mixing all 50 uniquely labelled reference antibodies into a single reaction well. For this reason, it is useful for each label to have a distinct property that allows it to be distinguished or detected when mixed with other labels. In one preferred embodiment, each capture antibody is labelled with a distinct pattern of fluorochromes so they can be colorimetrically distinguished from one another.

Once the test antibody mixture is created, it is placed into multiple wells of, for example, a microtiter plate. In this example, the same antibody mixture would be placed in each of 50 microtiter wells and the mixture in each well would then be incubated with the target antigen as a first step in the competition assay. After incubation with the target antigen, a single probe antibody selected from the original set of 50 antibodies is added to each well. In this example, only one probe antibody is added to each reference antibody mixture. If any labelled reference antibody in the well binds to the target antigen at the same epitope as the probe antibody, they will compete with one another for the epitope binding site.

It is understood by one of skill in the art that embodiments of the invention are not limited to only adding a single probe antibody to each well. Other methods wherein multiple probe antibodies, each one distinguishably labelled from one another, are added to the mixture are contemplated.

In order to determine whether the probe antibody has bound to any of the 50 labelled reference antibodies in the well, a labelled detection antibody is added to each of the 50 reactions. In one embodiment, the labelled detection antibody is a differentially labelled version of the same antibody used as the capture antibody. Thus, for example, the detection antibody can be a mouse anti-human IgG antibody or an anti-human IgKappa antibody. The detection antibody will bind to, and label, the probe antibody that was placed in the well.

The label on the detection antibody permits detection and measurement of the amount of probe antibody bound to a complex formed by a reference antibody, the antigen, and the probe antibody. This complex serves as a measurement of the competition between the probe antibody and the reference antibody. The detection antibody may be labelled with any suitable label which facilitates detection of the secondary antibody. For example, a detection antibody may be labelled with biotin, which facilitates fluorescent detection of the probe antibody when streptavidin-phycoerythrin (PE) is added. The detection antibody may be labelled with any label that uniquely determines its presence as part of a complex, such as biotin, digoxigenin, lectin, radioisotopes, enzymes, or other labels. If desired, the label may also facilitate isolation of beads or other surfaces with antibody-antigen complexes attached.

The amount of labelled detection antibody bound to each uniquely labelled reference antibody indicates the amount of bound probe antibody, because the labelled detection antibody is bound to the probe antibody, which is bound to antigen, which in turn is bound to the labelled reference antibody. Thus, by measuring the amount of labelled detection antibody bound to each one of the 50 labelled reference antibodies, the amount of bound probe antibody can be obtained, where the amount of bound probe antibody is an indicator of the similarity or dissimilarity of the epitope recognition properties of the two antibodies (probe and reference). If a measurable amount of the labelled detection antibody is detected on the labelled reference antibody-antigen complex, that is understood to indicate that the probe antibody and the reference antibody do not bind to the same epitope on the antigen. Conversely, if little or no measurable detection antibody is detected on the labelled reference antibody-antigen complex, that is understood to indicate that the probe antibody and the reference antibody for that reaction bind to very similar or identical epitopes on the antigen. If a small amount of detection antibody is detected on the reference antibody-antigen complex, that is understood to indicate that the reference and probe antibodies may have similar but not identical epitope recognition properties, e.g., the binding of the reference antibody to its epitope interferes with but does not completely inhibit binding of the probe antibody to its epitope.
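By way of illustration only, the three qualitative outcomes described above can be sketched in Python. The function name and the two cutoff parameters are hypothetical and are not taken from the present disclosure; suitable cutoff values would have to be determined empirically for a given assay.

```python
def interpret_signal(signal, low_cutoff, high_cutoff):
    """Map a detection-label signal to a qualitative competition call.

    low_cutoff and high_cutoff are hypothetical, assay-specific thresholds
    (assumptions for this sketch, not values from the disclosure).
    """
    if low_cutoff > high_cutoff:
        raise ValueError("low_cutoff must not exceed high_cutoff")
    if signal <= low_cutoff:
        # Little or no probe antibody bound: the pair competes.
        return "same or very similar epitope"
    if signal < high_cutoff:
        # Reduced but measurable binding: partial interference.
        return "similar but not identical epitope"
    # Strong signal: the probe bound antigen held by the reference antibody.
    return "different epitopes"


# Example with made-up cutoffs of 100 and 1000 intensity units.
calls = [interpret_signal(s, 100, 1000) for s in (40, 450, 5000)]
```

In this example, `calls` holds one call per signal, in the order low, intermediate, high.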

Another aspect of the present invention provides a method for detecting both the reference antibody and the amount of probe antibody bound to an antigen. If antibody complexes containing different reference antibodies have been mixed, then the unique property provided by the unique labels on the capture antibody can be used to identify the reference antibody coupled to that bead. Preferably, that distinct property is a unique emission spectrum.

The amount of probe antibody bound to any reference antibody can be determined by measuring the amount of detection label bound to the complex. The detection label may be a labelled detection antibody bound to probe antibody bound to the complex, or it may be a label attached to the probe antibody. Thus, the epitope recognition properties of both a reference antibody and a probe antibody can be measured by using a comparative measure of the competition between the two antibodies for an epitope.

Conditions for optimizing procedures can be determined by empirical methods and knowledge of one of skill in the art. Incubation time, temperature, buffers, reagents, and other factors can be varied until a sufficiently strong or clear signal is obtained. For example, the optimal concentration of various antibodies can be empirically determined by one of skill in the art, by testing antibodies and antigens at different concentrations and looking for the concentration that produces the strongest signal or other desired result. In one embodiment, the optimal concentration of primary and secondary antibodies—that is, antibodies to be binned—is determined by a double titration of two antibodies raised against different epitopes of the same antigen, in the presence of a negative control antibody that does not recognize the antigen.

Assays Using Colored Beads

In a preferred embodiment, large numbers of antibodies are simultaneously screened for their epitope recognition properties in a single assay using color-coded microspheres or beads to identify multiple reactions in a single tube or well, preferably using a system available from Luminex Corporation (Luminex Corp, Austin Tex.), and most preferably using the Luminex 100 system. Preferably, the MCAB assay is carried out using Luminex technology. In another preferred embodiment, up to 100 different antibodies to be tested are bound to Luminex beads with 100 distinct colors. This system provides 100 different sets of polystyrene beads with varying amounts of fluorochromes embedded. This gives each set of beads a distinct fluorescent emission spectrum and hence a distinct color code.

To characterize the binding properties of antibodies using the Luminex 100 system, beads are coated with a capture antibody which is covalently attached to each bead; preferably a mouse anti-human IgG or anti-human IgKappa monoclonal antibody is used. Each set of beads is then incubated in a well containing a reference antibody to be characterized (e.g., containing hybridoma supernatant) such that a complex is formed between the bead, the capture antibody, and the reference antibody (henceforth, a “reference antibody-bead” complex) which has a distinct fluorescence emission spectrum and hence, a color code, that provides a unique label for that reference antibody.

In this preferred embodiment, each reference antibody-bead complex from each reaction with each reference antibody is mixed with other reference antibody-bead complexes to form a mixture containing all the reference antibodies being tested, where each reference antibody is uniquely labelled by being coupled to a bead. The mixture is aliquotted into as many wells of a 96-well plate as is necessary for the experiment. Generally, the number of wells will be determined by the number of probe antibodies being tested, along with various controls. Each of these wells containing an aliquot of the mixture of reference antibody-bead complexes is incubated first with antigen and then probe antibody (one of the antibodies to be characterized), and then detection antibody (a labelled version of the original capture antibody), where the detection antibody is used for detection of bound probe antibody. In a preferred embodiment, the detection antibody is a biotinylated mouse anti-human IgG monoclonal antibody. This process is illustrated in FIG. 1.
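As a non-limiting sketch of the plate layout described above, the following Python function assigns one probe antibody (or control) to each well of a 96-well plate, every well being understood to contain the same mixture of reference antibody-bead complexes. The function name, the row-by-row fill order, and the default blocking-buffer control are assumptions made for illustration.

```python
import string


def assign_wells(probes, controls=("blocking buffer",)):
    """Assign each probe antibody and control to one well of a 96-well plate.

    Wells are filled row by row (A1..A12, B1..B12, ...); this fill order is
    an assumption of the sketch, not a requirement of the assay.
    """
    samples = list(probes) + list(controls)
    if len(samples) > 96:
        raise ValueError("more samples than wells on a 96-well plate")
    layout = {}
    for position, sample in enumerate(samples):
        row, column = divmod(position, 12)  # 8 rows (A-H) of 12 columns
        layout[f"{string.ascii_uppercase[row]}{column + 1}"] = sample
    return layout


# Fifty probe antibodies plus one blocking-buffer control occupy 51 wells.
layout = assign_wells([f"Ab{i}" for i in range(1, 51)])
```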

In the illustrative embodiment presented in FIG. 1, each reference antibody is coupled to a bead with a distinct emission spectrum, where the reference antibody is coupled through a mouse anti-human monoclonal capture antibody, forming a uniquely labelled reference antibody. The entire set of uniquely labelled reference antibodies is placed in the well of a multiwell microtiter plate. The set of reference antibodies is incubated with antigen, and then a probe antibody is added to the well. A probe antibody will only bind to antigen that is bound to a reference antibody that recognizes a different epitope. Binding of a probe antibody to antigen will form a complex consisting of a reference antibody coupled to a bead through a capture antibody, the antigen, and the bound probe antibody. A labelled detection antibody is added to detect bound probe antibody. Here, the detection antibody is labelled with biotin, and bound probe antibody is detected by the interaction of streptavidin-PE and the biotinylated detection antibody. As shown in FIG. 1, Antibody #50 is used as the probe antibody, and the reference antibodies are Antibody #50 and Antibody #1. Probe Antibody #50 will bind to antigen that is bound to reference Antibody #1 because the antibodies bind to different epitopes, and a labelled complex can be detected. Probe Antibody #50 will not bind to antigen that is bound by reference Antibody #50 because both antibodies are competing for the same epitope, such that no labelled complex is formed.

In this embodiment, after the incubation steps are completed, the beads of a given well are aligned in a single file in a cuvette and one bead at a time passes through two lasers. The first laser excites fluorochromes embedded in the beads, identifying which reference antibody is bound to each bead. A second laser excites fluorescent molecules bound to the bead complex, which quantifies the amount of bound detection antibody and hence, the amount of probe antibody bound to the antigen on a reference antibody-bead complex. When a strong signal for the detection antibody is measured on a bead, that indicates the reference and probe antibodies bound to that bead are bound to different sites on the antigen and hence, recognize different epitopes on the antigen. When a weak signal for the bound detection antibody is measured on a bead, that indicates the corresponding reference and probe antibodies compete for the same epitope. This is illustrated in FIG. 1. A key advantage of this embodiment is that it can be carried out in high-throughput mode, such that multiple competition assays can be simultaneously performed in a single well, saving both time and resources.
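The two-laser readout described above yields, for each bead, a bead color code and a reporter intensity. A minimal Python sketch of how such per-bead events from one well might be reduced to one value per reference antibody is given below. The event format, the function name, and the use of the median as the summary statistic are assumptions for illustration and are not stated in the present disclosure.

```python
from collections import defaultdict
from statistics import median


def summarize_well(events, bead_to_reference):
    """Reduce per-bead events from one well to one signal per reference antibody.

    events: iterable of (bead_code, reporter_intensity) pairs, one per bead.
    bead_to_reference: maps each bead color code to its reference antibody.
    The median is an assumed summary statistic for this sketch.
    """
    by_reference = defaultdict(list)
    for bead_code, reporter_intensity in events:
        # The bead code (first laser) identifies the reference antibody;
        # the reporter intensity (second laser) measures bound probe antibody.
        by_reference[bead_to_reference[bead_code]].append(reporter_intensity)
    return {ref: median(values) for ref, values in by_reference.items()}


# Made-up events for a well probed with Antibody #50 (compare FIG. 1).
events = [(1, 900.0), (1, 1100.0), (50, 20.0), (50, 40.0), (1, 1000.0)]
signals = summarize_well(events, {1: "Antibody #1", 50: "Antibody #50"})
```

With these made-up values, the strong signal on the Antibody #1 beads and the weak signal on the Antibody #50 beads correspond to the non-competing and competing cases, respectively.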

The assay described herein may include measurements of at least one additional parameter of the epitope recognition properties of primary and secondary antibodies being characterized, for example the effect of temperature, ion concentration, solvents (including detergent) or any other factor of interest. One of skill in the relevant art can use the present disclosure to develop an experimental design that permits the testing of at least one additional factor. If necessary, multiple replicates of an assay may be carried out, in which factors such as temperature, ion concentration, solvent, or others, are varied according to the experimental design. When additional factors are tested, methods of data analysis can be adjusted accordingly to include the additional factors in the analysis.

Data Analysis

Another aspect of the present invention provides processes for analyzing data generated from at least one assay, preferably from at least one high throughput assay, in order to identify antibodies having similar and dissimilar epitope recognition properties. A comparative approach, based on comparing the epitope recognition properties of a collection of antibodies, permits identification of those antibodies having similar epitope recognition properties, which are likely to compete for the same epitope, as well as the identification of those antibodies having dissimilar epitope recognition properties, which are likely to bind to different epitopes. In this way, antibodies can be categorized, or “binned” based on which epitope they recognize. A preferred embodiment provides the Competitive Pattern Recognition (CPR) process for analyzing data generated by a high throughput assay. More preferably, CPR is used to analyze data generated by the Multiplexed Competitive Antibody Binning (MCAB) high-throughput competitive assay. Application of data analysis processes as disclosed and claimed herein makes it possible to eliminate redundancy by identifying the distinct binding specificities represented within a pool of antigen-specific antibodies characterized by an assay such as the MCAB assay.

A preferred embodiment of the present invention provides a process that clusters antibodies into “bins” or categories representing distinct binding specificities for the antigen target. In yet another preferred embodiment, the CPR process is applied to data representing the outcomes of the MCAB high-throughput competition assay in which every antibody competes with every other antibody for binding sites on antigen molecules. Embodiments carried out using different data sets of antibodies generated from XenoMouse® animals provide a demonstration that application of the process of the present invention produces consistent and reproducible results.

The analysis of data generated from an experiment typically involves multi-step operations to normalize data across different wells in which the assay has been carried out and cluster data by identifying and classifying the competition patterns of the antibodies tested. A matrix-based computational process for clustering antibodies is then performed based on the similarity of their competition patterns, wherein the process is applied to classify sets of antibodies, preferably antibodies generated from hybridoma cells.

Antibodies that are clustered based on the similarity of their competition patterns are considered to bind the same epitope or similar epitopes. These clusters may optionally be displayed in matrix format, or in “tree” format as a dendrogram, or in a computer-readable format, or in any data-input-device-compatible format. Information regarding clusters may be captured from a matrix, a dendrogram or by a computer or other computational device. Data capture may be visual, manual, automated, or any combination thereof.

As used herein, the term “bin” may be used as a noun to refer to clusters of antibodies identified as having similar competition patterns according to the methods of the present invention. The term “bin” may also be used as a verb to refer to practicing the methods of the present invention. The term “epitope binning assay” as used herein, refers to the competition-based assay described herein, and includes any analysis of data produced by the assay.

Steps in data analysis are described in detail in the following disclosure, and practical guidance is provided by reference to the data and results presented in Example 2. References to the data of Example 2, especially the matrices or dendrograms generated by performing various data analysis steps on the input data of Example 2, serve merely as illustrations and do not limit the scope of the present invention in any way.

When the data sets generated are large in number and size, a systematic method is needed to analyze the matrices of signal intensities to determine which antibodies have similar signal intensity patterns. By way of example, two matrices containing m rows and m columns are generated in a single experiment, where m is the number of antibodies being examined. One matrix has signal intensities for the set of competition assays in which antigen is present. The second matrix has the corresponding signal intensities for a negative control experiment in which antigen is absent. Each row in a matrix represents a unique well in a multiwell microtiter plate, which identifies a unique probe antibody. Each column represents a unique bead spectral code, which identifies a unique reference antibody. The intensity of signal detected in each cell in a matrix represents the outcome of an individual competition assay involving a reference antibody and a probe antibody. The last row in the matrix corresponds to the well in which blocking buffer is added instead of a probe antibody. Similarly, the last column in the matrix corresponds to the bead spectral code to which blocking buffer is added instead of reference antibody. Blocking buffer serves as a negative control and determines the amount of signal present when only one antibody (of the reference-antibody-probe-antibody pair) is present.
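The matrix layout described above (rows for probe antibodies, columns for reference antibodies, and a final row and column for blocking buffer) can be sketched as follows. The function name, the dictionary-based input format, and the "buffer" label are hypothetical choices made for this illustration.

```python
def build_intensity_matrix(antibodies, readings, buffer_label="buffer"):
    """Arrange signal intensities as a square matrix.

    Rows are probe antibodies (wells); columns are reference antibodies
    (bead codes). The last row and last column hold the blocking-buffer
    controls. readings maps (probe, reference) pairs to intensities;
    cells with no reading are left as None.
    """
    labels = list(antibodies) + [buffer_label]
    index = {name: i for i, name in enumerate(labels)}
    size = len(labels)
    matrix = [[None] * size for _ in range(size)]
    for (probe, reference), intensity in readings.items():
        matrix[index[probe]][index[reference]] = intensity
    return labels, matrix


# Made-up readings for two antibodies plus the buffer controls.
readings = {
    ("Ab1", "Ab1"): 30, ("Ab1", "Ab2"): 950, ("Ab1", "buffer"): 12,
    ("Ab2", "Ab1"): 900, ("Ab2", "Ab2"): 25, ("Ab2", "buffer"): 10,
    ("buffer", "Ab1"): 15, ("buffer", "Ab2"): 14, ("buffer", "buffer"): 8,
}
labels, matrix = build_intensity_matrix(["Ab1", "Ab2"], readings)
```

For two antibodies this produces a 3 x 3 matrix, the third row and third column being the blocking-buffer controls.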

Similar signal intensity value patterns for two rows indicate that the two probe antibodies exhibit similar binding behaviors, and hence likely compete for the same epitope. Likewise, similar signal intensity patterns for two columns indicate that the two reference antibodies exhibit similar binding behaviors, and hence likely compete for the same epitope. Antibodies with dissimilar signal patterns likely bind to different epitopes. Antibodies can be grouped, or “binned,” according to the epitope that they recognize, by grouping together rows with similar signal patterns or by grouping together columns with similar signal patterns. Such an assay described above is referred to as an epitope binning assay.
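To make the notion of comparing row patterns concrete, the following Python sketch scores two rows by the fraction of positions at which one row is above a cutoff and the other is not, and groups antibodies whose thresholded patterns are identical. This is a simplified illustration under assumed names and an assumed single cutoff; it is not presented as the CPR process itself, which is described separately.

```python
def binarize(row, cutoff):
    """Mark each intensity as above (True) or below (False) the cutoff."""
    return tuple(value >= cutoff for value in row)


def pattern_dissimilarity(row_a, row_b, cutoff):
    """Fraction of positions where the two thresholded patterns disagree."""
    if len(row_a) != len(row_b) or not row_a:
        raise ValueError("rows must be non-empty and of equal length")
    a, b = binarize(row_a, cutoff), binarize(row_b, cutoff)
    return sum(x != y for x, y in zip(a, b)) / len(a)


def group_identical_patterns(rows_by_antibody, cutoff):
    """Group antibodies whose thresholded signal patterns are identical."""
    groups = {}
    for name, row in rows_by_antibody.items():
        groups.setdefault(binarize(row, cutoff), []).append(name)
    return list(groups.values())


# Made-up rows: Ab1 and Ab2 share a pattern, Ab3 shows the opposite pattern.
rows = {"Ab1": [10, 20, 900], "Ab2": [15, 30, 950], "Ab3": [800, 850, 12]}
bins = group_identical_patterns(rows, cutoff=100)
```

Requiring identical patterns is deliberately strict; a practical analysis would likely tolerate some disagreement between rows.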

Program To Apply Competitive Pattern Recognition (CPR) Process

One aspect of the present invention provides a program to apply the CPR process having two main steps: (1) normalization of signal intensities; and (2) generation of dissimilarity matrices and clustering of antibodies based on their normalized signal intensities. It is understood that the term “main step” encompasses multiple steps that may be carried out as necessary, depending on the nature of the experimental material used and the nature of the data analysis desired. It is also understood that additional steps may be practiced as part of the present invention.

Background Normalization of Signal Intensities

Input data is subjected to a series of preprocessing steps that improve the ability to detect meaningful patterns. Preferably, the input data comprises signal intensities stored in a two dimensional matrix, and a series of normalization steps are carried out to eliminate sources of noise or signal bias prior to clustering analysis.

The input data to be analyzed comprises the results from a complete assay of epitope recognition properties. Preferably, results comprise signal intensities measured from an assay carried out using labelled secondary antibodies. More preferably, results using the MCAB assay are analyzed as described herein. Two input files are generated: one input file from an assay in which antigen was added; and a second input file from an assay in which antigen was absent. The experiment in which antigen is absent serves as a negative control allowing one to quantify the amount of binding by the labelled antibodies that is not to the antigen. Preferably, each combination of primary antibody and secondary antibody being tested was assayed in the presence and absence of antigen, such that each combination is represented in both sets of input data. Even more preferably, the assay is carried out using the procedures for assaying epitope recognition properties of multiple antibodies using a multi-well format disclosed elsewhere in the present disclosure.

The input data normally comprises signal intensities stored in a two dimensional matrix. First, the matrix corresponding to the experiment without antigen (the negative control), A_(B), is subtracted from the matrix corresponding to the experiment with antigen, A_(E), to give the background normalized matrix given by A_(N)=A_(E)−A_(B). This subtraction step eliminates background signal that is not due to binding of antibodies to antigen. The above matrices are of dimension (m+1)×(m+1) where m is the number of antibodies to be clustered. The last row and the last column contain intensity values for experiments in which blocking buffer was added in place of a probe antibody or reference antibody, respectively.
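By way of non-limiting illustration, the subtraction step may be sketched in Python as follows. This is a minimal sketch and not the program of the invention: the function name `background_normalize`, the list-of-rows layout, and the intensity values are all assumptions made for the example.

```python
def background_normalize(a_e, a_b):
    """Return A_N = A_E - A_B, computed cell by cell.

    a_e: intensities measured with antigen present.
    a_b: intensities measured with antigen absent (negative control).
    Both are lists of rows with identical dimensions (assumed layout).
    """
    if len(a_e) != len(a_b) or any(len(r) != len(s) for r, s in zip(a_e, a_b)):
        raise ValueError("matrices must have the same dimensions")
    return [[e - b for e, b in zip(row_e, row_b)]
            for row_e, row_b in zip(a_e, a_b)]


# Invented values for m = 2 antibodies; the last row and last column
# stand for the blocking buffer controls.
a_e = [[300, 2500, 2600],
       [2400, 350, 2700],
       [2800, 2900, 3000]]
a_b = [[100, 120, 110],
       [90, 150, 100],
       [130, 140, 120]]
a_n = background_normalize(a_e, a_b)
```

In this invented data set the diagonal cells remain low after subtraction, as would be expected when an antibody competes against itself.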

In an illustrative embodiment, FIGS. 8A and 8B illustrate the intensity matrices generated in the embodiment disclosed in Example 2, which are used as input data matrices for subsequent steps in data analysis. FIG. 8A is the intensity matrix for an experiment conducted with antigen, and FIG. 8B is the intensity matrix for the same experiment conducted without antigen. Each row in the matrix corresponds to the signal intensities for the different beads in one well, where each well represents a unique detecting antibody. Each column represents the signal intensities corresponding to the competition of a unique primary antibody with each of the secondary antibodies. Each cell in the matrix represents an individual competition assay for a different pair of primary and secondary antibodies. In assays of epitope recognition properties, addition of blocking buffer in place of one of the antibodies serves as a negative control. In the embodiment illustrated by FIGS. 8A and 8B, the last row in the matrix corresponds to the well in which blocking buffer is added in place of a secondary antibody, and the last column in the matrix corresponds to the beads to which blocking buffer is added in place of primary antibody. Other arrangements of cells within a matrix can be used to practice aspects of the present invention, as one of skill in the relevant art can design data matrices having other formats and adapt subsequent manipulations of these data matrices to reflect the particular format chosen.

A difference matrix can be generated by subtracting the matrix corresponding to values obtained from the experiment without antigen from the matrix corresponding to values obtained from the experiment with antigen. This step is performed to subtract from the total signal the amount of signal that is not attributed to the binding of the labelled probe antibody to the antigen. This subtraction step generates a difference matrix as illustrated in FIG. 9. Following this subtraction, any antibodies that have unusually high intensities for their diagonal values relative to the other diagonal values are flagged. High values for a column both along and off the diagonal suggest that the data associated with this particular bead may not be reliable. The antibodies corresponding to these columns are flagged at this step and are considered as individual bins.

Elimination of Background Signals Due to Nonspecific Binding: Normalization of Signal Intensities within Rows or Columns of the Matrix.

In some cases, there is a significant disparity in the overall signal intensities between different rows or columns in the background-normalized signal intensity matrix. Row variations are likely due to variations in intensity from well to well, while column variation is likely due to the variation in the affinities and concentrations of different probe antibodies. In accordance with one aspect of the present invention, there is often a linear correlation between the blocking buffer values of the rows or columns, and the average signal intensity values of the rows or columns. If an intensity variation is observed, an additional step of row and/or column normalization is performed as described below.

Row Normalization.

Row normalization is performed when there are any significant well-specific signal biases, and is carried out to eliminate any “signal artifacts” that would otherwise be introduced into the data analysis. One of skill in the art can determine whether the step is desirable based on the distribution of intensity values of the blocking buffer negative controls. By way of illustration, in FIG. 2A, the blocking buffer intensity value for each row is plotted against the average intensity value (excluding the blocking buffer value) for the corresponding row. The plot in FIG. 2A shows a clear linear correlation between the blocking buffer values and the average intensity value for a row. This figure shows that there is a well-specific signal bias in the samples being analyzed, and that the intensity value for the blocking buffer correlates to the overall signal intensity within a row. The different intensity biases seen in the different rows are likely due in part to the variation in affinity of the secondary antibodies for the antigen as well as the concentration variations of these secondary antibodies. Note that FIG. 2B shows that, for the same embodiment, there is weaker correlation between the blocking buffer intensity values for the columns and the average column intensity values.

For intensity variations in rows, the intensities of each row in the matrix are adjusted by dividing each value in a row by the blocking buffer intensity value for that row. In the case where blocking buffer data is absent, each row value is divided by the average intensity value for the row. In an embodiment applying the CPR process, the intensity-normalized matrix is given by

${{A_{I}\left( {i,j} \right)} = {{\frac{A_{N}\left( {i,j} \right)}{I(k)}\mspace{14mu} 1} \leq i}},{j \leq {m + 1}}$

where I is a vector containing the blocking buffer or average intensities and k=i if normalization is done with respect to rows.
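By way of non-limiting illustration, row normalization with k=i may be sketched as follows. The function name `normalize_rows`, the assumption that the blocking buffer value is the last entry of each row, and the example values are hypothetical and are not taken from the disclosed embodiments.

```python
def normalize_rows(a_n, use_blocking_buffer=True):
    """Divide each row of A_N by its divisor I(i).

    I(i) is the blocking buffer value of the row (assumed here to be
    the last entry) or, when blocking buffer data is absent, the
    average intensity of the row.
    """
    a_i = []
    for row in a_n:
        if use_blocking_buffer:
            divisor = row[-1]
        else:
            divisor = sum(row) / len(row)
        a_i.append([value / divisor for value in row])
    return a_i


# Invented background-normalized intensities.
a_n = [[200, 2000, 1000],
       [400, 100, 500],
       [900, 1200, 1500]]
by_buffer = normalize_rows(a_n)
by_mean = normalize_rows(a_n, use_blocking_buffer=False)
```

Dividing by the row average is offered only as the fallback described above; the two choices of divisor generally give different scales and should not be mixed within one matrix.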

Column Normalization.

In this final pre-processing step, each column in the row normalized matrix (that was not flagged at the step in which the difference matrix was generated) is divided by its corresponding diagonal value. The cells along the diagonal represent competition assays for which the primary and secondary antibodies are the same. Ideally, values along the diagonal should be small as two copies of the same antibody should compete for the same epitope. The division of each column by its corresponding diagonal is done to measure each intensity relative to an intensity that is known to reflect competition, i.e., competition of an antibody against itself.

For intensity variations in columns, the intensities of each column in the matrix are adjusted by dividing each value in a column by the blocking buffer intensity value for that column. In the case where blocking buffer data is absent, each column value is divided by the average intensity value for the column. In an embodiment applying the CPR process, the intensity-normalized matrix is given by

${{A_{I}\left( {i,j} \right)} = {{\frac{A_{N}\left( {i,j} \right)}{I(k)}\mspace{14mu} 1} \leq i}},{j \leq {m + 1}}$

where I is a vector containing the blocking buffer or average intensities and k=j if normalization is done with respect to columns.

Setting Threshold Values Prior to Row or Column Normalization.

To prevent artificial inflation of low signal values in this normalization step, all blocking buffer values that are below a minimum user-defined threshold value are flagged and then adjusted to the user-defined threshold value which represents the lowest reliable signal intensity value, prior to row or column division. This threshold is set based on a histogram of the signal intensities. This normalization step adjusts for variations in intensity from well to well.
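By way of non-limiting illustration, the threshold adjustment may be sketched as follows. The function name `floor_divisors` and the example blocking buffer values are hypothetical; the minimum of 200 intensity units is used only as an illustrative setting.

```python
def floor_divisors(divisors, minimum):
    """Raise divisors that fall below `minimum` up to `minimum`.

    Returns the adjusted divisors and the indices that were flagged
    because their original value was below the minimum.
    """
    flagged = [i for i, value in enumerate(divisors) if value < minimum]
    adjusted = [max(value, minimum) for value in divisors]
    return adjusted, flagged


# Invented blocking buffer values for four rows.
blocking_buffer = [1500.0, 40.0, 900.0, 180.0]
adjusted, flagged = floor_divisors(blocking_buffer, minimum=200.0)
```

Without this adjustment, the second row in the example would be divided by 40 rather than 200, inflating every value in that row fivefold relative to the adjusted result.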

By way of example, FIG. 17 illustrates an adjusted difference matrix for the data of Example 2, wherein the minimum reliable signal intensity is set to 200 intensity units. Each row in the matrix is adjusted by dividing it by the last intensity value in the row. As noted above, the last intensity value in each row corresponds to the intensity value for beads to which blocking buffer is added in place of primary antibody. This step adjusts for the well-to-well variation in intensity values across the row. FIG. 18 illustrates a row normalized matrix for the data of Example 2.

Further by way of example, FIG. 2A presents data from an embodiment in which the blocking buffer intensity value for each row was plotted against the average intensity value for the corresponding row. This plot shows a linear correlation between the blocking buffer values and the average intensity value for a row, and suggests that there are well-specific intensity biases. These biases may be partially due to the variation in affinity of the probe antibodies for the antigen and the concentration variations of the probe antibodies. FIG. 2B presents data from an embodiment in which the blocking buffer intensity value for each column was plotted against the average intensity value for the corresponding column.

In another illustrative embodiment, FIG. 2C shows a scatter plot of the background-normalized difference matrix intensities plotted against the intensities for the matrix of results from an embodiment using antigen. This plot shows a tight linear correlation (slope=1) for signal values greater than 1000, and a more scattered correlation for lower signal values. The points in FIG. 2C are shaded according to the value of a fraction calculated as the subtracted signal divided by the signal for the experiment with antigen present. Smaller fraction values (closer to zero) correspond to high background contribution and have light shading in FIG. 2C. Larger fraction values (closer to 1) correspond to lower background contribution and have darker shading. In FIG. 2C, the smaller fraction values are predominantly in the lower-left region of the scatter plot, suggesting that the contribution of background becomes less for subtracted signal values greater than 1000.

The plot shown in FIG. 2C suggests that for this embodiment, intensity values of the background-normalized matrix greater than 1000 have a low background signal contribution relative to the signal due to antigen binding. These matrix cells likely correspond to antibody pairs that do not compete for the same epitope. Conversely, intensity values below 1000 likely correspond to antibody pairs that bind to the same epitope. In accordance with one aspect of the present invention, it is expected that the intensity values along the diagonal would be small, as identical reference and probe antibodies compete for the same epitope. In the embodiment illustrated in FIG. 2C, all but one of the diagonal values of the background-normalized signal intensity matrix have intensity values below 1000.

Normalization of Signal Intensities Relative to the Baseline Signal for Probe Antibodies

In a final step, data are adjusted by dividing each column or row by its corresponding diagonal value to generate the final normalized matrix given by

${A_{F}\left( {i,j} \right)} = {\frac{A_{I}\left( {i,j} \right)}{A_{I}\left( {j,j} \right)}.}$

Once again, to prevent artificial inflation of low signal values in this normalization step, all diagonal values below a minimum user-defined threshold value are adjusted to the threshold value before the diagonal division is done. This step is done for all columns or rows, except those that have diagonal values that are significantly high relative to other values in the column or row. This step normalizes each intensity value relative to the intensity corresponding to the individual competition assay for which the reference and probe antibodies are the same. This intensity value should be low and ideally reflect the baseline signal intensity value for the column or row, because two identical antibodies should compete for the same epitope and hence be unable to simultaneously bind to the same antigen. Columns having unusually large diagonal values are identified as outliers and excluded from the analysis. High-diagonal-intensity values may indicate that the antigen has two copies of the same epitope, e.g., when the antigen is a homodimer.
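By way of non-limiting illustration, division of each column by its diagonal value, with the threshold adjustment applied to the diagonal, may be sketched as follows. The function name `normalize_by_diagonal` and the example values are hypothetical, the matrix is assumed to be square, and the exclusion of columns having unusually large diagonal values is not shown.

```python
def normalize_by_diagonal(a_i, minimum):
    """Return A_F(i, j) = A_I(i, j) / max(A_I(j, j), minimum).

    Diagonal values below `minimum` are raised to `minimum` before the
    division so that small diagonal values do not inflate a column.
    """
    n = len(a_i)
    diagonal = [max(a_i[j][j], minimum) for j in range(n)]
    return [[a_i[i][j] / diagonal[j] for j in range(n)] for i in range(n)]


# Invented intensity-normalized values; the third diagonal value (0.05)
# is below the minimum and is raised to 0.1 before division.
a_i = [[0.5, 2.0, 1.5],
       [1.0, 0.25, 3.0],
       [2.0, 1.0, 0.05]]
a_f = normalize_by_diagonal(a_i, minimum=0.1)
```

After this step, each diagonal value at or above the minimum equals 1, so that off-diagonal values can be read as multiples of the self-competition signal.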

Pattern Recognition Analysis: Dissimilarity Matrices

In accordance with another aspect of the present invention, a second step in data analysis involves generating a dissimilarity matrix from the normalized intensity matrix in two steps. First, the normalized intensity values that are below a user-defined threshold value for background are set to zero (and hence represent competition) and the remaining values are set to 1, indicating that the antibodies bind to two different epitopes. Accordingly, intensity values that are less than the intensity equal to this threshold multiplied by the intensity value of the diagonal value are considered low enough to represent competition for the same epitope by the antibody pair. The dissimilarity matrix or distance matrix for a given threshold value is computed from the matrix of zeroes and ones by determining the number of positions in which each pair of rows differs. The entry in row i and column j corresponds to the fraction of the total number of primary antibodies that differ in their competition patterns with the secondary antibodies represented in rows i and j.

By way of example, FIG. 14 shows the number of positions (out of 22 total) at which the patterns for any two antibodies differ. In this embodiment, dissimilarities are computed with respect to rows instead of columns because the row intensities have already been adjusted for well-specific intensity biases and therefore the undesirable effects of unequal secondary antibody affinities and concentrations have been factored out. In addition, the concentrations and affinities of primary antibodies are consistent between rows. However, for the columns, there is not an apparent consistent trend between average intensity and background intensity, which suggests that there is not an obvious way to factor out the undesirable effects of the variable primary antibody concentrations and affinities. Therefore, comparing the signals between columns might be less valid.

Dissimilarity Matrix Using CPR.

In an embodiment applying the CPR process, a threshold matrix, A_(T), of zeros and ones is generated as described below. Normalized values that are less than or equal to a threshold value are set to zero to indicate that the corresponding pairs of antibodies compete for the same epitope. The threshold matrix is given by

${A_{T}\left( {i,j} \right)} = \left\{ \begin{matrix} {{0\mspace{14mu} {if}\mspace{14mu} {A_{F}\left( {i,j} \right)}} \leq T} \\ {{1\mspace{14mu} {if}\mspace{14mu} {A_{F}\left( {i,j} \right)}} > {T.}} \end{matrix} \right.$

The remaining normalized intensity values are set to one, and the values represent pairs of antibodies that bind to different epitopes.
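By way of non-limiting illustration, generation of the threshold matrix may be sketched as follows. The function name `threshold_matrix`, the example values, and the threshold of 2.0 are hypothetical.

```python
def threshold_matrix(a_f, t):
    """Return A_T: 0 where A_F(i, j) <= t (competition), otherwise 1."""
    return [[0 if value <= t else 1 for value in row] for row in a_f]


# Invented normalized intensities.
a_f = [[1.0, 3.2, 2.0],
       [2.9, 1.0, 1.4],
       [4.0, 0.8, 1.0]]
a_t = threshold_matrix(a_f, 2.0)
```

A value exactly equal to the threshold, such as the 2.0 in the first row, is set to zero, consistent with the less-than-or-equal condition above.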

The dissimilarity matrix is computed from the threshold matrix by setting the value in the i^(th) row and j^(th) column of the dissimilarity matrix to the fraction of the positions at which two rows, i and j of the matrix of zeros and ones, differ. A dissimilarity matrix for a specified threshold value, T, is given by

${D_{T}\left( {i,j} \right)} = \frac{m - {N_{1}\left( {i,j} \right)}}{m}$

where N₁ is the number of ones (1s) present when the i^(th) and j^(th) rows are summed.

By way of example, for the matrix shown in Table 1 below, the dissimilarity value corresponding to the first and second rows is 0.4, because the number of positions at which the two rows differ is 2 out of 5. For an ideal experiment, the dissimilarity matrix that is generated based on a comparison of rows of the original signal intensity matrix, should be the same as the dissimilarity matrix that is generated based on the comparison of columns.

TABLE 1 Matrix Used to Compute Dissimilarity Values

     A  B  C  D  E
A    0  1  1  1  0
B    1  1  1  0  0
C    1  1  1  1  1
D    1  1  1  0  1
E    1  0  1  1  0
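By way of non-limiting illustration, the dissimilarity values for the matrix of Table 1 may be computed as sketched below. The sketch follows the verbal definition given above, namely the fraction of positions at which two rows of the matrix of zeros and ones differ; the function name `dissimilarity_matrix` is hypothetical.

```python
def dissimilarity_matrix(a_t):
    """Return D where D[i][j] is the fraction of positions at which
    rows i and j of the zero/one matrix a_t differ."""
    n_rows = len(a_t)
    n_cols = len(a_t[0])
    d = [[0.0] * n_rows for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_rows):
            differing = sum(1 for x, y in zip(a_t[i], a_t[j]) if x != y)
            d[i][j] = differing / n_cols
    return d


# Rows A through E of Table 1.
table_1 = [[0, 1, 1, 1, 0],
           [1, 1, 1, 0, 0],
           [1, 1, 1, 1, 1],
           [1, 1, 1, 0, 1],
           [1, 0, 1, 1, 0]]
d = dissimilarity_matrix(table_1)
```

For rows A and B the sketch gives 2/5 = 0.4, in agreement with the value stated above, because the two rows differ in the first and fourth positions only.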

Effect of Calculating Dissimilarity Matrices at Multiple Threshold Values.

If desired, the process of generating dissimilarity matrices is repeated for background threshold values incremented inclusively between two user-defined threshold values which represent lower and upper threshold values for intensity (where the threshold value is as described above). The dissimilarity matrices generated over a range of background threshold values are averaged and used as input to the clustering algorithm. The process of averaging over several thresholds is performed to minimize the sensitivity of the final dissimilarity matrix to any one particular choice for the threshold value. The effect of variation of the threshold value on the apparent dissimilarity is illustrated by FIG. 4, which shows the fraction of dissimilarities for a pair of antibodies (2.1 and 2.25) as a function of the threshold value for threshold values ranging between 1.5 and 2.5. As the threshold value changes from 1.8 to 1.9, the amount of dissimilarity between the signal patterns for the two antibodies changes substantially, from 15% to nearly 0%. This figure shows how the amount of dissimilarity between the signal patterns for a pair of antibodies may be sensitive to one particular choice for a cutoff value, as it can vary substantially for different threshold values. The sensitivity is mitigated by taking the average dissimilarity value over a range of different threshold values.

Calculating Dissimilarity Matrices at Multiple Threshold Values Using CPR.

In a preferred embodiment, the process of computing dissimilarity matrices using CPR is repeated for several incremental threshold values within a user-defined range of values. The average of these dissimilarity matrices is computed and used as input to the clustering step where the average is computed as

${{D_{Ave}\left( {i,j} \right)} = \frac{\sum\limits_{T}{D_{T}\left( {i,j} \right)}}{N_{T}}}\;$

where N_(T) is the number of different thresholds to be averaged.

This process of averaging over several thresholds is done to minimize the sensitivity of the dissimilarity matrix to a particular cutoff value for the threshold.
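By way of non-limiting illustration, averaging of dissimilarity matrices over several threshold values may be sketched as follows. The function name `average_dissimilarity`, the example values, and the three thresholds are hypothetical; each per-threshold dissimilarity is computed as the fraction of positions at which two rows of the zero/one matrix differ.

```python
def average_dissimilarity(a_f, thresholds):
    """Average the row dissimilarity matrices obtained at each threshold."""
    n = len(a_f)
    total = [[0.0] * n for _ in range(n)]
    for t in thresholds:
        # Zero/one matrix for this threshold: 0 means competition.
        binary = [[0 if value <= t else 1 for value in row] for row in a_f]
        for i in range(n):
            for j in range(n):
                differing = sum(
                    1 for x, y in zip(binary[i], binary[j]) if x != y
                )
                total[i][j] += differing / len(binary[i])
    return [[value / len(thresholds) for value in row] for row in total]


# Invented normalized intensities for three antibodies.
a_f = [[1.0, 1.7, 3.0],
       [1.9, 1.0, 3.5],
       [3.2, 2.8, 1.0]]
thresholds = [1.5, 2.0, 2.5]
d_ave = average_dissimilarity(a_f, thresholds)
```

In this invented example the first two rows differ in two of three positions at a threshold of 1.5 but in none at thresholds of 2.0 and 2.5, so the averaged value of 2/9 is less sensitive to the choice of any single cutoff.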

Dissimilarity Matrices from Multiple Experiments

If there are input data sets for more than one experiment, normalized intensity matrices are first generated as described above for each individual experiment. Normalized values above a threshold value (typically set to 4) are then set to this threshold value. Setting the high-intensity values to the threshold value is done to prevent any single intensity value from having too much weight when the average normalized intensity values are computed for that cell. The average intensity matrix is computed by taking individual averages over all data points for each antibody pair out of the group consisting of antibodies that are in at least one of the input data sets. Antibody pairs for which there are no intensity values are flagged. The generation of the dissimilarity matrix is as described above with the exception that the entry in row i and column j corresponds to the fraction of the positions at which two rows, i and j, differ out of the total number of positions for which both rows have an intensity value. If the two rows have no such positions, then the dissimilarity value is set arbitrarily high and flagged.
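By way of non-limiting illustration, the combination of several experiments may be sketched as follows. The dictionary layout keyed by (row antibody, column antibody), the function names, the antibody labels, the competition threshold of 2.0, and the placeholder value of 2.0 returned for rows having no positions in common are all assumptions made for the example.

```python
def combine_experiments(experiments, cap=4.0):
    """Average normalized intensities per antibody pair, after setting
    values above `cap` to `cap`. Pairs absent from every experiment are
    simply absent from the result."""
    sums, counts = {}, {}
    for experiment in experiments:
        for pair, value in experiment.items():
            sums[pair] = sums.get(pair, 0.0) + min(value, cap)
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: sums[pair] / counts[pair] for pair in sums}


def row_dissimilarity(avg, row_a, row_b, columns, t, missing_value=2.0):
    """Fraction of shared positions at which two rows differ after
    thresholding at t. Returns (value, flagged)."""
    shared = [c for c in columns if (row_a, c) in avg and (row_b, c) in avg]
    if not shared:
        return missing_value, True  # no common positions: flag the pair
    differing = sum(
        1 for c in shared if (avg[(row_a, c)] <= t) != (avg[(row_b, c)] <= t)
    )
    return differing / len(shared), False


# Two invented experiments that share antibody Ab1.
exp_1 = {("Ab1", "Ab1"): 0.9, ("Ab1", "Ab2"): 1.2,
         ("Ab2", "Ab1"): 1.4, ("Ab2", "Ab2"): 1.0}
exp_2 = {("Ab1", "Ab1"): 1.1, ("Ab1", "Ab3"): 9.0,
         ("Ab3", "Ab1"): 5.0, ("Ab3", "Ab3"): 1.0}
avg = combine_experiments([exp_1, exp_2])
columns = ["Ab1", "Ab2", "Ab3"]
d_12 = row_dissimilarity(avg, "Ab1", "Ab2", columns, t=2.0)
d_13 = row_dissimilarity(avg, "Ab1", "Ab3", columns, t=2.0)
d_14 = row_dissimilarity(avg, "Ab1", "Ab4", columns, t=2.0)
```

In this invented example, the value of 9.0 is set to 4.0 before averaging, and the hypothetical antibody Ab4, which appears in neither experiment, yields a flagged result.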

Clustering of Antibodies Based on their Normalized Signal Intensities

Another aspect of the present invention provides processes for clustering antibodies based on their normalized signal intensities, using various computational approaches to identify underlying patterns in complex data. Preferably, any such process utilizes computational approaches developed for clustering points in multidimensional space. These processes can be directly applied to experimental data to determine epitope binding patterns of sets of antibodies by regarding the signal levels for the n² competition assays of n probe antibodies in n sampled reference antibodies as defining n points in n-dimensional space. These methods can be directly applied to epitope binning by regarding the signal levels for the competition assays of each secondary antibody with all of the n different primary antibodies as defining a point in n-dimensional space.

Results of clustering analysis can be expressed using visual displays. In addition or in the alternative, the results of clustering analysis can be captured and stored independently of any visual display. Visual displays are useful for communicating the results of an epitope binning assay to at least one person. Visual displays may also be used as a means for providing quantitative data for capture and storage. In one preferred embodiment, clusters are displayed in a matrix format and information regarding clusters is captured from a matrix. Cells of a matrix can have different intensities of shading or patterning to indicate the numerical value of each cell; alternately, cells of a matrix can be color-coded to indicate the numerical value of each cell. In another preferred embodiment, clusters are displayed as dendrograms or “trees” and information regarding clusters is captured from a dendrogram based on branch length and height (distance) of branches. In yet another preferred embodiment, clusters are identified by automated means, and information regarding clusters is captured by an automated data analysis process using a computer or any data input device.

One approach that has proven valuable for the analysis of large biological data sets is hierarchical clustering (Eisen et al. (1998) Proc. Natl. Acad. Sci. USA 95:14863-14868). Applying this method, antibodies can be forced into a strict hierarchy of nested subsets based on their dissimilarity values. In an illustrative embodiment, the pair of antibodies with the lowest dissimilarity value is grouped together first. The pair or cluster(s) of antibodies with the next smallest dissimilarity (or average dissimilarity) value is grouped together next. This process is iteratively repeated until one cluster remains. In this manner, the antibodies are grouped according to how similar their competition patterns are, compared with the other antibodies. In one embodiment, antibodies are grouped into a dendrogram (sometimes called a “phylogenetic tree”) whose branch lengths represent the degree of similarity between the binding patterns of the two antibodies. Long branch lengths between two antibodies indicate they likely bind to different epitopes. Short branch lengths indicate that two antibodies likely compete for the same epitope.
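By way of non-limiting illustration, the iterative grouping described above may be sketched as a simple average-linkage agglomerative procedure. This is a generic sketch and is not the agglomerative nesting routine of any particular statistical package; the function name `agglomerate`, the antibody labels, and the dissimilarity values are hypothetical.

```python
def agglomerate(labels, d):
    """Repeatedly merge the two clusters having the smallest average
    dissimilarity until one cluster remains.

    Returns a list of merges as (members_a, members_b, height).
    """
    clusters = [[i] for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                height = sum(d[i][j] for i, j in pairs) / len(pairs)
                if best is None or height < best[0]:
                    best = (height, a, b)
        height, a, b = best
        merges.append((sorted(labels[i] for i in clusters[a]),
                       sorted(labels[i] for i in clusters[b]),
                       height))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Invented average dissimilarity matrix for four antibodies.
labels = ["Ab1", "Ab2", "Ab3", "Ab4"]
d = [[0.0, 0.1, 0.9, 0.8],
     [0.1, 0.0, 0.7, 0.9],
     [0.9, 0.7, 0.0, 0.2],
     [0.8, 0.9, 0.2, 0.0]]
merges = agglomerate(labels, d)
```

In this invented example Ab1 and Ab2 are grouped first, Ab3 and Ab4 second, and the two groups are joined last at a height of 0.825, which would correspond to a long branch between two putative epitope bins.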

In a preferred embodiment, the antibodies corresponding to the rows in the matrix are clustered by hierarchical clustering based on the values in the average dissimilarity matrix using an agglomerative nesting subroutine incorporating the Manhattan metric with an input dissimilarity matrix of the average dissimilarity matrix. In an especially preferred embodiment, antibodies are clustered by hierarchical clustering based on the values in the average dissimilarity matrix using the SPLUS 2000 agglomerative nesting subroutine using the Manhattan metric with an input dissimilarity matrix of the average dissimilarity matrix. (SPLUS 2000 Statistical Analysis Software, Insightful Corporation, Seattle, Wash.)

In accordance with another aspect of the present invention, the degree of similarity between two dendrograms provides a measure of the self-consistency of the analyses performed by a program applying the CPR process. A non-limiting theory regarding similarity and consistency predicts that a dendrogram generated by clustering rows and a dendrogram generated by clustering columns of the same background-normalized signal intensity matrix should be identical, or nearly so, because: if Antibody #1 and Antibody #2 compete for the same epitope, then the intensity should be low when Antibody #1 is the reference antibody and Antibody #2 is the probe antibody, as well as when Antibody #2 is the reference antibody and Antibody #1 is the probe antibody. Likewise, when the two antibodies bind to different epitopes, the intensities should be uniformly high. By this reasoning, the degree of similarity between two rows of the signal intensity matrix should be the same as between two columns of the similarity matrix. A high level of self-consistency between row clustering and column clustering suggests that, for a given experiment, the experimental protocol described herein, practiced with the program for applying the process of the present invention, produces robust results.

In accordance with a further aspect of the present invention, the degree of overlap between two epitopes may also be inferred based on the lengths of the longest branches connecting clusters in a dendrogram. For example, if a target antigen has two distinct, completely nonoverlapping epitopes, then one would expect that an antibody binding to one of the epitopes would have an opposite signal intensity pattern from an antibody binding to another epitope. According to this reasoning, if the binding sites are nonoverlapping, the signal patterns for the set of antibodies binding one epitope should be completely anticorrelated to the signal pattern for the set of antibodies recognizing the other epitope. Hence, dissimilarity values that are close to one (1) for two different clusters suggest that the corresponding epitopes do not interfere with each other or overlap in their binding sites on the antigen.

The embodiment described in Example 2 below demonstrates how clustering results can be displayed as a dendrogram (FIG. 5) or in matrix form (FIGS. 16 and 17). The data points (values of antibodies against the ANTIGEN14 target) were grouped into a dendrogram whose branch lengths represent the degree of similarity between two antibodies, where the dendrogram was generated using the Agglomerative Nesting module of the SPLUS 2000 statistical analysis software. To facilitate comparison, in FIGS. 16 and 17, the order of the antibodies in rows and columns of the matrices is the same as the order of the antibodies as displayed from left to right under the dendrogram in FIG. 5. The individual cells are visually coded by assigning a fill pattern to cells according to their numerical value. In FIG. 16, cells with values below a lower threshold value have forward hatching. Cells with values between a lower threshold and an upper threshold have no fill pattern. Cells with values above the upper threshold have stippling. A block having cells that have no fill pattern or have forward hatching indicates that all of the antibodies corresponding to that block recognize the same epitope. Cells with stippling correspond to antibodies that recognize different epitopes. In FIG. 17, the cells are the normalized intensity values and are also visually coded according to their value. Cells that have forward hatching have intensities below a lower threshold, cells with no fill pattern have intensities between a lower and an upper threshold, while cells with back hatching have intensities above an upper threshold. A cell with forward hatching indicates the antibodies in its corresponding row and column compete for the same epitope (as the intensity is low). A cell with back hatching corresponds to a higher intensity and is indicative that the antibodies in the corresponding row and column bind to different epitopes.

The results from this illustrative embodiment (Example 2) indicate that the processes of the present invention provide a high level of self-consistency for the data with regard to revealing whether or not two antibodies compete for the same epitope. The symmetry of the fill patterns in FIGS. 16 and 17 with respect to the diagonal clearly shows this self-consistency. The reason is that the antibodies in row A and column B are the same pair as in row B and column A. Hence, if the pair of antibodies compete for the same epitope, then the intensity should be low both when antibody A is the primary antibody and antibody B is the secondary antibody, as well as when antibody B is the primary antibody and antibody A is the secondary antibody. Therefore, the intensity for the cell of the ith row and jth column as well as that for the jth row and ith column should both be low. Likewise, if these two antibodies recognize different epitopes, then both corresponding intensities should be high. Out of the approximately 200 pairs of cells in FIG. 17, only one pair showed a discrepancy where one member of the pair had an intensity below 1.5 while the other member had an intensity above 2.5. The level of self-consistency of the resulting normalized matrices produced by the algorithm provides a measure of the reliability of both the data generated as well as the algorithm's analysis of the data. The high level of self-consistency for the data set (over 99%) of antibodies against the ANTIGEN14 target suggests that the data analysis processes disclosed and claimed herein generate reliable results.
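By way of non-limiting illustration, the self-consistency check described above may be sketched as follows. The function name `count_discrepancies` and the example matrix are hypothetical; the default limits of 1.5 and 2.5 are the values mentioned above for FIG. 17.

```python
def count_discrepancies(a_f, low=1.5, high=2.5):
    """Compare each cell (i, j) with its mirror cell (j, i).

    A pair is counted as discrepant when one member is below `low`
    and the other is above `high`. Returns (total_pairs, discrepant).
    """
    n = len(a_f)
    total_pairs = 0
    discrepant = []
    for i in range(n):
        for j in range(i + 1, n):
            total_pairs += 1
            smaller, larger = sorted((a_f[i][j], a_f[j][i]))
            if smaller < low and larger > high:
                discrepant.append((i, j))
    return total_pairs, discrepant


# Invented normalized intensities for three antibodies.
a_f = [[1.0, 1.2, 3.0],
       [1.1, 1.0, 1.3],
       [3.4, 2.9, 1.0]]
total_pairs, discrepant = count_discrepancies(a_f)
consistency = 1 - len(discrepant) / total_pairs
```

In this invented example one of the three mirrored pairs is discrepant, since the cell in the second row and third column is 1.3 while its mirror cell is 2.9.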

Clustering Antibodies from Multiple Experiments.

Another aspect of the present invention provides a method for combining data sets to overcome limitations of experimental systems used to screen antibodies. By performing multiple experiments in which each experiment has at least x antibodies in common with each other experiment, and providing the multiple resulting data sets as input to the clustering process, it should be possible to reliably cluster very large numbers of antibodies. By having a set of m antibodies in common between the m experiments, it becomes possible to infer which cluster antibodies are likely to belong to even if they are not tested against every other antibody. This suggests that using this method for data analysis with multiple data sets, it may be possible to achieve an even higher throughput with fewer assays.

By way of example, the Luminex technology provides 100 unique fluorochromes, so it is possible to study 100 antibodies at most in a single experiment. The consistency of results produced by the clustering step for individual data sets and the combined data set indicate that it is possible to infer which epitope is recognized by which antibody, even if the epitope and/or antibody are not tested against every other antibody. In a preferred embodiment, the CPR process can be used to characterize the binding patterns of more than 100 antibodies by performing multiple experiments using overlapping antibody sets. By designing experiments in such a way that each experiment has a set of antibodies in common with the other experiments, the combined-average matrix will not have any missing data.

A further aspect provides that the results of data analysis for a given set of antibodies are useful to aid in the rational design of subsequent experiments. For example, if a data set for a first experiment shows well-defined clusters emerging, then the set of antibodies for a second experiment should include representative antibodies from the first set of antibodies as well as untested antibodies. This approach ensures that each set of antibodies has sufficient material to define the two epitopes, and that the sets overlap sufficiently to permit comparison between sets. By comparing the competition patterns of an untested set of antibodies in the second experiment with a sample set of known antibodies from the first experiment, it should be possible to determine whether or not the untested antibodies recognize the same epitope(s) as do the first set of antibodies. This overlapping experimental design permits reliable comparison of the competition patterns of the first set with the second set of antibodies, to determine whether the antibodies in the second experiment recognize existing epitopes, or whether they recognize one or more completely novel epitopes. Further, experiments can be iteratively designed in an optimal way, so that multiple sets of antibodies can be tested against existing and new clusters.

Analysis of Data from Multiple Experiments.

Results from the embodiment described in Example 3 below, using antibodies against the ANTIGEN39 target, demonstrate that the processes disclosed and claimed herein are suitable for analyzing data from multiple experiments. In this embodiment, ANTIGEN39 antibodies were tested for binding to cell surface ANTIGEN39 antigen, where ANTIGEN39 antigen is a cell surface protein. First, normalized intensity matrices were generated for each individual experiment, wherein normalized values above a selected threshold value are set to the selected threshold value to prevent any single normalized intensity value from having too much influence on the average value for that antibody pair. A single normalized matrix was generated from the individual normalized matrices by taking the average of the normalized intensity values over all experiments for each antibody pair for which data was available. Then a single dissimilarity matrix was generated as described above, with the exception that the fraction of the positions at which two rows, i and j, differ only considers the number of positions for which both rows have an intensity value.

For five experiments using ANTIGEN39 antibodies, the clustering results for the five input data sets showed that there were a large number of clusters of varying degree of similarity, suggesting the presence of several different epitopes, some of which may overlap. This is shown in FIG. 6A, FIG. 18, FIG. 19, and FIG. 30. For example, the cluster containing antibodies 1.17, 1.55, 1.16, 1.11, and 1.12 and the cluster containing 1.21, 2.12, 2.38, 2.35, and 2.1 are fairly closely related, as each antibody pair shows no more than 25% difference, with the exception of 2.35 and 1.11. This high degree of similarity across the two clusters suggested that the two different epitopes may have a high degree of similarity.

The five data sets from separate experiments using ANTIGEN39 antibodies were also independently clustered, to demonstrate that the processes disclosed and claimed herein produce consistent clustering results. Clustering results are summarized in FIGS. 6B-6F and in FIGS. 20-30, where FIG. 30 summarizes the clusters for each of the individual data sets and for the combined data set with all of the antibodies for the five experiments. FIG. 6B shows the dendrogram for the ANTIGEN39 antibodies for Experiment 1: Antibodies 1.12, 1.63, 1.17, 1.55, and 2.12 consistently clustered together in this experiment as well as in other experiments as do antibodies 1.46, 1.31, 2.17, and 1.29. FIG. 6C shows the dendrogram for the ANTIGEN39 antibodies for Experiment 2: Antibodies 1.57 and 1.61 consistently clustered together in this experiment as well as in other experiments.

FIG. 6D shows the dendrogram for the ANTIGEN39 antibodies for Experiment 3: Antibodies 1.55, 1.12, 1.17, 2.12, 1.11, and 1.21 consistently clustered together in this experiment as well as in other experiments. FIG. 6E shows the dendrogram for the ANTIGEN39 antibodies for Experiment 4: Antibodies 1.17, 1.16, 1.55, 1.11, and 1.12 consistently clustered together in this experiment as well as in other experiments as do antibodies 1.31, 1.46, 1.65, and 1.29, as well as antibodies 1.57 and 1.61. FIG. 6F shows the dendrogram for the ANTIGEN39 antibodies for Experiment 5: Antibodies 1.21, 1.12, 2.12, 2.38, 2.35, and 2.1 consistently clustered together in this experiment as well as in other experiments.

In general, the clustering algorithm produced consistent results both among the individual experiments and between the combined and individual data sets. Antibodies which cluster together or are in neighboring clusters for multiple individual data sets also cluster together or are in neighboring clusters for the combined data set. For example, cells having lighter shading indicate antibodies that consistently clustered together in the combined data set and in all of the data sets in which they were present (Experiments 1, 3, 4, and 5). These results indicate that the algorithm produces consistent clustering results across multiple individual experiments and that it retains the consistency upon the merging of multiple data sets.

Finally, there is a high level of self-consistency for the data with regard to revealing whether or not two antibodies compete for the same epitope. The percent of antibody pairs for which the data consistently reveals whether or not they compete for the same epitope is summarized for each data set in Table 2, below, which reveals that the consistency was nearly 90% for four out of the five individual data sets as well as for the combined data set.

TABLE 2

Percent Consistency Values for ANTIGEN39 Antibody Experiments

Experiment    % Consistency
1             92
2             82
3             88
4             92
5             88
Combined      88

Consistency of Epitope Binning Results with Flow Cytometry (FACS) Results

Results from the embodiment described in Example 3 below, using antibodies against the ANTIGEN39 target, further demonstrate that results generated by epitope binning according to the methods of the present invention are consistent with the results generated using flow cytometry (fluorescence-activated cell sorter, FACS). Cells expressing ANTIGEN39 were sorted by FACS, and ANTIGEN39-negative cells, also sorted by FACS, were used as negative controls. The cell surface binding sites recognized by antibodies from different bins represent different epitopes. FIG. 3 shows a comparison of results from antibody experiments using the anti-ANTIGEN39 antibody, with results using FACS. As shown in FIG. 3, the antibodies in a given bin are either all positive (bins 1, 4, and 5) or all negative (bins 2 and 3) in FACS, which indicates that the antibody epitope binning assay indeed bins antibodies based on their epitope binding properties. Thus, epitope binning, as described herein, provides an efficient, rapid, and reliable method for determining the epitope recognition properties of antibodies, and sorting and categorizing antibodies based on the epitope they recognize.

Alternative Data Analysis Process and Consistency of Epitope Binning with Sequence Results.

An alternative data analysis process involves subtracting the data matrix for the experiment carried out without antigen from the data matrix for the experiment carried out with antigen to generate a normalized background intensity matrix. The value in each diagonal cell is then used as a background value for determining the binding affinity of the antibody in the corresponding column. Cells in each column of the normalized background intensity matrix (the subtracted matrix) having values significantly higher than the value of the diagonal cell for that column are highlighted or otherwise noted. Generally, a value of about two times the corresponding diagonal is considered “significantly higher”, although one of skill in the art can determine what increase over background is the threshold for “significantly higher” in a particular embodiment, taking into account the reagents and conditions used, and the “noisiness” of the input data. Columns with similar binding patterns are grouped as a bin, and minor differences within the bin are identified as sub-bins. This data analysis can be carried out automatically for a given set of input data. For example, input data can be stored in a computer database application where the cells on the diagonal are automatically marked, the cells in each column are compared with the diagonal value for that column and highlighted accordingly, and columns with similar binding patterns are grouped.
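The alternative process above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the no-antigen matrix is subtracted from the with-antigen matrix (as in Example 2), rows are reference antibodies and columns are probe antibodies, the cutoff is two times the diagonal, and only columns with identical patterns are grouped (sub-bins are not modeled). The function name `bin_columns` and its parameters are hypothetical.

```python
def bin_columns(with_antigen, without_antigen, names, factor=2.0):
    """Group antibodies (columns) whose above-background patterns are identical.

    Hypothetical helper: square matrices are given as lists of rows, and
    `factor` is the assumed "significantly higher" cutoff (about 2x).
    """
    n = len(names)
    # Background-subtracted matrix: signal with antigen minus signal without.
    diff = [[with_antigen[i][j] - without_antigen[i][j] for j in range(n)]
            for i in range(n)]
    bins = {}
    for j, name in enumerate(names):
        background = diff[j][j]  # diagonal cell serves as the column background
        # True where a cell is "significantly higher" than the diagonal cell.
        pattern = tuple(diff[i][j] >= factor * background for i in range(n))
        bins.setdefault(pattern, []).append(name)
    return list(bins.values())


# Toy data: A and B block each other (low signal); C binds elsewhere.
with_ag = [[100, 120, 900],
           [110, 100, 950],
           [800, 850, 100]]
no_ag = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
groups = bin_columns(with_ag, no_ag, ["A", "B", "C"])
```

In this toy input, antibodies A and B share a pattern and fall into one group, while C forms its own group. A real data set would need a tolerance for near-identical patterns, which is where the sub-bins described above would come in.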

In a preferred embodiment using fifty-two (52) antibodies against ANTIGEN54, binning results using the data analysis process described above correlated with sequence analysis of the CDR regions of antibodies binned using the MCAB competitive antibody assay. The 52 antibodies consisted of 2 or 3 clones from 20 cell lines. As expected, sequences of clones from the same line were identical, so only one representative clone from each line was sequenced. The correspondence between the epitope binning results and sequence analysis of antibodies binned by this method indicates that this approach is suitable for identifying antibodies having similar binding patterns. In addition, correspondence between the epitope binning results and sequence analysis of antibodies binned by this method means that the epitope binning method provides information and guidance about which antibody sequences are important in determining the epitope specificity of antibody binding.

EXAMPLES

Example 1

Assay of Epitope Recognition Properties

Generation and Preliminary Characterization of Antibodies.

Hybridoma supernatants containing antigen-specific human IgG monoclonal antibodies used for binning were collected from cultured hybridoma cells that had been transferred from fusion plates to 24-well plates. Supernatant was collected from 24-well plates for binning analysis. Antibodies specific for the antigen of interest were selected by hybridoma screening, using ELISA screening against their antigens. Antibodies positive for binding to the antigen were ranked by their binding affinity through a combination of a 96-well plate affinity ranking method and BIAcore affinity measurement. Antibodies with high affinity for the antigen of interest were selected for epitope binning. These antibodies were used as the reference and probe test antibodies in the assay.

Assay Using Luminex Beads

First, the concentration of mouse anti-human IgG (mxhIgG) monoclonal antibodies used as capture antibody to capture the reference antibody was measured, and mxhIgG antibodies were dialyzed in PBS to remove azides or other preservatives that could interfere with the coupling process. Then the mxhIgG antibodies were coupled to Luminex beads (Luminex 100 System, Luminex Corp., Austin Tex.) according to manufacturer's instructions in the Luminex User Manual, pages 75-76. Briefly, mxhIgG capture antibody at 50 μg/ml in 500 μl PBS was combined with beads at 1.25×10⁷ beads/ml in 300 μl. After coupling, beads were counted using a hemocytometer and the concentration was adjusted to 1×10⁷ beads/ml.

The antigen-specific antibodies were collected and screened as described above, and their concentrations were determined. Up to 100 antibodies were selected for epitope binning. The antibodies were diluted according to the following formula for linking the antibodies to up to 100 uniquely labelled beads to form labelled reference antibodies:

Total volume of the samples in each tube: Vt=(n+1)×100μl+150μl,

where n=total number of samples including controls.

Volume of individual sample needed for dilution: Vs=C×Vt/Cs,

Cs=IgG concentration of each sample. C=0.2-0.5 μg/ml.
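As an illustration only, the two dilution formulas above can be evaluated directly. The function names below are hypothetical; volumes are in μl and concentrations in μg/ml, and the default target concentration of 0.5 μg/ml is simply the upper end of the stated 0.2-0.5 μg/ml range.

```python
def total_volume_ul(n_samples):
    """Vt = (n + 1) x 100 ul + 150 ul, where n counts all samples and controls."""
    return (n_samples + 1) * 100 + 150


def sample_volume_ul(stock_conc_ug_ml, n_samples, target_conc_ug_ml=0.5):
    """Vs = C x Vt / Cs: volume of undiluted sample needed per tube."""
    return target_conc_ug_ml * total_volume_ul(n_samples) / stock_conc_ug_ml


vt = total_volume_ul(20)          # 20 samples including controls
vs = sample_volume_ul(10.0, 20)   # sample stock at 10 ug/ml IgG
```

For 20 samples, Vt is 2250 μl, and a sample at 10 μg/ml diluted to 0.5 μg/ml requires 112.5 μl of that sample in the tube.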

Samples were prepared according to the above formula, and 150 μl of each diluted sample containing a reference antibody was aliquotted into a well of a 96-well plate. Additional aliquots were retained for use as a probe antibody at a later stage in the assay. The stock of mxhIgG-coupled beads was vortexed and diluted to a concentration of 2500 of each bead per well or 0.5×10⁵/ml. The reference antibodies were incubated with mxhIgG-coupled beads on a shaker in the dark at room temperature overnight.

A 96-well filter plate was pre-wetted by adding 200 μl wash buffer and aspirating. Following overnight incubation, beads (now with reference antibodies bound to mxhIgG bound to beads) were pooled, and 100 μl was aliquotted into each well of a 96-well microtiter filter plate at a concentration of 2000 beads per well. The total number of aliquots of beads was twice the number of samples to be tested, thereby permitting parallel experiments with and without antigen. Buffer was immediately aspirated to remove any unbound reference antibody, and beads were washed three times.

Antigen was added (50 μl) to one set of samples; and beads were incubated with antigen at a concentration of 1 μg/ml for one hour. A buffer control was added to the other set of samples, to provide a negative control without antigen.

All antibodies being used as probe antibodies were then added to all samples (with antigen, and without antigen). In this experiment, each antibody being used as a reference antibody was also used as a probe antibody, in order to test all combinations. The probe antibody should be taken from the same diluted solution as the reference antibody, to ensure that the antibody is used at the same concentration. Probe antibody (50 μl/well) was added to all samples and mixtures were incubated in the dark for 2 hours at room temperature on a shaker. Samples were washed three times to remove unbound probe antibody.

Detection Antibody:

Biotinylated mxhIgG (50 μl/well) was added at a 1:500 dilution, and the mixture was incubated in the dark for 1 hour on a shaker. Beads were washed three times to remove unbound biotinylated mxhIgG. Streptavidin-PE at 1:500 dilution was added, 50 μl/well. The mixture was incubated in the dark for 15 minutes at room temperature on a shaker, and then washed three times to remove unbound components.

In accordance with manufacturer's instructions, the Luminex 100 and XYP base were warmed up using Luminex software. A new session was initiated, and the number of samples and the designation numbers of the beads used in the assay were entered.

Beads in each well were resuspended in 80 μl dilution buffer. The 96-well plate was placed in the Luminex base and the fluorescence emission spectrum of each well was read and recorded.

Optimization of Assay

To optimize the assay, the Luminex User's Manual Version 1.0 was initially used for guidance regarding the concentrations of beads, antibodies, and incubation times. It was determined empirically that a longer incubation time ensured binding saturation and was more suitable for the nanogram antibody concentrations used in the assay.

Example 2

Analysis of a Single Data Set

ANTIGEN14 Antibodies

Data Input

Antibodies were assayed as described in Example 1, and results were collected. Input files consisted of input matrices shown in FIG. 8A (antigen present) and FIG. 8B (antigen absent) for a data set corresponding to a single experiment for the ANTIGEN14 target.

Normalization of ANTIGEN14 Target Data

First, the matrix corresponding to the experiment without antigen (negative control, FIG. 8B) was subtracted from the matrix corresponding to the experiment with antigen (FIG. 8A), to eliminate the amount of background signal due to nonspecific binding of the labelled antibody. The difference between the two matrices is shown in FIG. 9. The column corresponding to antibody 2.42 has unusually large values both on and off the diagonal and was flagged and treated separately in the data analysis as described above.

Row Normalization

The difference matrix was adjusted by setting values below the user-defined threshold value of 200 to this threshold value as shown in FIG. 10. This adjustment was done to prevent significant artificial inflation of low signal values in subsequent normalization steps (as described above). The intensities of each row in the matrix were then normalized by dividing each row value by the row value corresponding to blocking buffer (FIG. 11). This adjusts for the well-to-well intensity variation as discussed above and illustrated in FIG. 2A.
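A minimal sketch of this row normalization step follows, assuming the blocking buffer occupies one column of the difference matrix (identified here by a hypothetical `buffer_col` index) and that the floor of 200 is applied before the division. The function name `row_normalize` is an assumption for illustration.

```python
def row_normalize(diff, buffer_col, floor=200.0):
    """Floor low values, then divide each row by its blocking-buffer value.

    diff       -- background-subtracted matrix as a list of rows
    buffer_col -- assumed index of the blocking-buffer column
    floor      -- user-defined threshold (200 in this example)
    """
    # Raise values below the floor so that small signals are not inflated
    # when they are later divided by other small numbers.
    floored = [[max(value, floor) for value in row] for row in diff]
    # Divide every entry in a row by that row's blocking-buffer entry.
    return [[value / row[buffer_col] for value in row] for row in floored]


diff = [[50.0, 400.0, 800.0],
        [1000.0, 150.0, 500.0]]
normalized = row_normalize(diff, buffer_col=2)
```

After this step the blocking-buffer entry of each row equals 1, so the remaining entries read as multiples of the signal obtained with no competing antibody in that well.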

Column Normalization

All columns except the one corresponding to antibody 2.42 were column-normalized as described above and are shown in FIG. 12.

Dissimilarity Matrix

A dissimilarity (or distance) matrix was generated in a multistep procedure. First, intensity values below the user-defined threshold (set to two times the diagonal intensity values) were set to zero and the remaining values were set to one (FIG. 13). This means that intensity values that are less than twice the intensity value of the diagonal value are considered low enough to represent competition for the same epitope by the antibody pair. The dissimilarity matrix is generated from the matrix of zeroes and ones by setting the entry in row i and column j to the fraction of the positions at which two rows, i and j, differ. FIG. 14 shows the number of positions (out of 22 total) at which the patterns for any two antibodies differed for the set of antibodies generated against the ANTIGEN14 target.

A dissimilarity matrix was generated from the matrix of zeroes and ones generated from each of several threshold values ranging from 1.5 to 2.5 (times the values of the diagonals), in increments of 0.1. The average of these dissimilarity matrices was computed (FIG. 15) and used as input to the clustering algorithm. The significance of taking the average of several dissimilarity matrices is illustrated in FIG. 4. FIG. 4 shows the fraction of dissimilarities for a pair of antibodies (2.1 and 2.25) as a function of the threshold value for threshold values ranging from 1.5 to 2.5. As the threshold value changed from 1.8 to 1.9, the amount of dissimilarity between the signal patterns for the two antibodies changed substantially, from 0% to nearly 15%. This figure shows how the amount of dissimilarity between the signal patterns for a pair of antibodies may be sensitive to one particular choice of cutoff value, as it can vary substantially for different threshold values.
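The thresholding and averaging described above can be sketched as follows. This is an illustration under assumptions, not the original implementation: each cell is compared with the diagonal cell of its own column, thresholds run from 1.5 to 2.5 in steps of 0.1, and all function names are hypothetical.

```python
def binarize(norm, threshold):
    """0 where a cell is below threshold x its column's diagonal cell, else 1."""
    n = len(norm)
    return [[0 if norm[i][j] < threshold * norm[j][j] else 1
             for j in range(n)] for i in range(n)]


def dissimilarity(binary):
    """Fraction of positions at which each pair of rows differs."""
    n = len(binary)
    return [[sum(a != b for a, b in zip(binary[i], binary[j])) / n
             for j in range(n)] for i in range(n)]


def average_dissimilarity(norm, thresholds=None):
    """Average the dissimilarity matrices obtained over several thresholds."""
    if thresholds is None:
        thresholds = [round(1.5 + 0.1 * k, 1) for k in range(11)]  # 1.5 .. 2.5
    n = len(norm)
    total = [[0.0] * n for _ in range(n)]
    for t in thresholds:
        d = dissimilarity(binarize(norm, t))
        for i in range(n):
            for j in range(n):
                total[i][j] += d[i][j]
    return [[value / len(thresholds) for value in row] for row in total]


# Toy normalized matrix: the 1.7 entry flips between 0 and 1 across thresholds.
norm = [[1.0, 1.2, 3.0],
        [1.1, 1.0, 1.7],
        [3.0, 2.8, 1.0]]
avg = average_dissimilarity(norm)
```

In the toy matrix, rows 0 and 1 are identical for thresholds up to 1.7 and differ at one of three positions for the remaining eight thresholds, so their averaged dissimilarity is 8/33 (about 0.24) rather than jumping between 0 and 1/3 depending on a single cutoff.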

Clustering:

Hierarchical Clustering.

Using the Agglomerative Nesting Subroutine in SPLUS 2000 statistical analysis software, antibodies were grouped (or clustered) using the average dissimilarity matrix described above as input. In this algorithm, antibodies were forced into a strict hierarchy of nested subsets. The pair of antibodies with the smallest corresponding dissimilarity value in the entire matrix is grouped together first. Then, the pair of antibodies, or antibody-cluster, with the second smallest dissimilarity (or average dissimilarity) value is grouped together next. This process was iteratively repeated until one cluster remained.
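For readers without access to the statistical package, the nesting procedure can be approximated with the short sketch below. It is not the SPLUS routine itself: it assumes average linkage (suggested by the reference to "average dissimilarity" above), breaks ties by first occurrence, and uses a hypothetical function name and return format.

```python
def agglomerative_average(dissim, names):
    """Repeatedly merge the two clusters with the smallest average dissimilarity.

    Returns a list of (sorted member names, merge height) tuples in the
    order the merges occurred; the final entry contains every antibody.
    """
    clusters = [[i] for i in range(len(names))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average dissimilarity over all cross-cluster pairs.
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dissim[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merged = clusters[a] + clusters[b]
        merges.append((sorted(names[i] for i in merged), d))
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Toy dissimilarity matrix with two tight groups: {A, B} and {C, D}.
dissim = [[0.0, 0.0, 0.9, 0.9],
          [0.0, 0.0, 0.9, 0.9],
          [0.9, 0.9, 0.0, 0.1],
          [0.9, 0.9, 0.1, 0.0]]
merges = agglomerative_average(dissim, ["A", "B", "C", "D"])
```

The merge heights recorded here correspond to the branch heights of a dendrogram: the two tight pairs join near zero, and the final merge occurs at about 0.9.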

Visualizing Clusters in Dendrograms

The dendrogram calculated for the ANTIGEN14 target is shown in FIG. 5. The length (or height) of the branches connecting two antibodies is inversely proportional to the degree of similarity between the antibodies. This dendrogram shows that there were two very distinct epitopes recognized by these antibodies. One epitope was recognized by antibodies 2.73, 2.4, 2.16, 2.15, 2.69, 2.19, 2.45, 2.1, and 2.25. A different epitope was recognized by antibodies 2.13, 2.78, 2.24, 2.7, 2.76, 2.61, 2.12, 2.55, 2.31, 2.56, and 2.39. Antibody 2.42 did not have a pattern that was very similar to any other antibody but had some noticeable similarity to the second cluster, indicating that it may recognize yet a third epitope which partially overlaps with the second epitope.

Visualizing Clusters in Matrices

This clustering of these antibodies can also be seen in FIG. 16 and FIG. 17. In FIG. 16 the rows and columns of the dissimilarity matrix were rearranged according to the order of the “leaves” or clades on the dendrogram and the individual cells were visually coded according to the degree of dissimilarity. Cells that have forward hatching correspond to antibody pairs that were very similar (less than 10% dissimilar). Cells that have no fill pattern correspond to those antibodies that were fairly similar (between 10% and 25% dissimilar). Cells that have stippling correspond to antibody pairs that were more than 25% dissimilar. The forward hatched blocks correspond to different clusters of antibodies. Excluding the blocking buffer, there appeared to be two, or possibly three, blocks corresponding to the groups of antibodies mentioned above. FIG. 16 also shows that, allowing for a slightly higher tolerance for dissimilarity, Antibody 2.42 can be considered a member of the second cluster.

In FIG. 17, the rows and columns of the normalized intensity matrix were rearranged according to the order of the leaves on the dendrogram and the individual cells were visually coded according to their normalized intensity values. Cells that have forward hatching correspond to antibody pairs that had a high intensity (at least 2.5 times greater than the background). Cells that have no fill pattern had an intensity between 1.5 and 2.5 times the background. Cells that have forward hatching correspond to intensities that were less than 1.5 times the background. When comparing the visual markings of the rows of this matrix, two very distinct patterns emerged corresponding to the two epitopes shown above. Furthermore, note that the visual coding is very symmetric with respect to the diagonal. This shows that there was a high level of self-consistency for the data with regard to revealing whether two antibodies compete for the same epitope. The reason is that if antibody A and antibody B compete for the same epitope, then the intensity should be low both when antibody A is the primary antibody and antibody B is the secondary antibody, and when antibody B is the primary antibody and antibody A is the secondary antibody. Therefore, the intensity for the cell of the i^(th) row and j^(th) column, as well as that for the j^(th) row and i^(th) column, should both be low. Likewise, if these two antibodies recognized different epitopes, then both corresponding intensities should have been high. Out of the approximately 200 pairs of cells, for only one pair did one member of the pair have an intensity below 1.5 while the other member had an intensity above 2.5. The level of self-consistency of the resulting normalized matrices produced by the algorithm provided a measure of the reliability of both the data generated and the algorithm's analysis of the data. The high level of self-consistency for the ANTIGEN14 data set (over 99%) suggests that one can trust the results of the algorithm for this data set with a high level of confidence.
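The self-consistency figure can be reproduced with a simple symmetry check. The sketch below assumes a pair counts as discrepant only when one orientation is below 1.5 and the other is above 2.5, as in the description above; the function name and defaults are hypothetical. With about 200 pairs and a single discrepancy, this measure gives 199/200, or 99.5%.

```python
def self_consistency(norm, low=1.5, high=2.5):
    """Percent of antibody pairs whose two orientations do not contradict.

    A pair (i, j) is discrepant when one of norm[i][j] and norm[j][i] is
    below `low` (competition) while the other is above `high` (no competition).
    """
    n = len(norm)
    total = 0
    discrepant = 0
    for i in range(n):
        for j in range(i + 1, n):
            a, b = norm[i][j], norm[j][i]
            total += 1
            if min(a, b) < low and max(a, b) > high:
                discrepant += 1
    return 100.0 * (total - discrepant) / total


# Toy matrix: pair (0, 2) is discrepant (3.0 one way, 1.0 the other).
norm = [[1.0, 1.0, 3.0],
        [1.2, 1.0, 3.0],
        [1.0, 2.9, 1.0]]
percent = self_consistency(norm)
```

In the toy matrix, one of the three pairs is discrepant, so the measure is about 66.7%.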

Example 3

Analysis of Multiple Data Sets

ANTIGEN39

When there are input data sets for more than one experiment, normalized intensity matrices are first generated as described above for each individual experiment. Normalized values above a threshold value (typically set to 4) are set to the corresponding threshold value. This prevents any single normalized intensity value from having too much influence on the average value for that antibody pair. A single normalized matrix is generated from the individual normalized matrices by taking the average of the normalized intensity values over all experiments for each antibody pair for which there is data. Antibody pairs with no corresponding intensity values are flagged. The generation of the dissimilarity matrix is as described above, with the exception that the fraction of the positions at which two rows, i and j, differ only considers the number of positions for which both rows have an intensity value. If the two rows have no such positions, then the dissimilarity value is set arbitrarily high and flagged.
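The merging of several experiments might be sketched as below. The data layout is an assumption made for illustration: each experiment is a dictionary keyed by (reference, probe) antibody names, missing pairs are simply absent, binary pattern rows use None for missing positions, and positive infinity stands in for the "arbitrarily high" flagged value. All names are hypothetical.

```python
def combine_experiments(experiments, cap=4.0):
    """Cap each normalized value at `cap`, then average over experiments.

    Pairs that were never measured are absent from the result, which is
    how this sketch flags missing data.
    """
    sums, counts = {}, {}
    for experiment in experiments:
        for pair, value in experiment.items():
            sums[pair] = sums.get(pair, 0.0) + min(value, cap)
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: sums[pair] / counts[pair] for pair in sums}


def masked_dissimilarity(row_i, row_j, missing_value=float("inf")):
    """Fraction of differing positions among those present in both rows."""
    shared = [(a, b) for a, b in zip(row_i, row_j)
              if a is not None and b is not None]
    if not shared:
        return missing_value  # no common data: flag with a very high value
    return sum(a != b for a, b in shared) / len(shared)


exp1 = {("A", "B"): 1.0, ("A", "C"): 10.0}
exp2 = {("A", "B"): 2.0}
combined = combine_experiments([exp1, exp2])
d = masked_dissimilarity([0, 1, None, 1], [0, 0, 1, None])
```

Here the (A, B) pair averages to 1.5, the single (A, C) value is capped at 4 before averaging, and the two example rows share only their first two positions, one of which differs.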

Five experiments were conducted using ANTIGEN39 antibodies, using methods described in Examples 1 and 2, and throughout the description. The clustering results for the five input data sets of ANTIGEN39 antibodies are summarized in FIG. 6A, FIG. 18, FIG. 19, and FIG. 30. The results show that there were a large number of clusters of varying degree of similarity. This suggests there were several different epitopes, some of which may overlap. For example, the cluster containing antibodies 1.17, 1.55, 1.16, 1.11, and 1.12 and the cluster containing 1.21, 2.12, 2.38, 2.35, and 2.1 are fairly closely related (each antibody pair with the exception of 2.35 and 1.11 being no more than 25% different). This high degree of similarity across the two clusters suggests that the two different epitopes may have a high degree of similarity.

In order to test the algorithm's ability to produce consistent clustering results, the five data sets were also independently clustered. The clustering results for the different experiments are summarized in FIGS. 6B-6F and in FIGS. 20-30. FIG. 30 summarizes the clusters for each of the individual data sets and for the combined data set with all of the antibodies for the five experiments. FIG. 6B shows the dendrogram for the ANTIGEN39 antibodies for Experiment 1: Antibodies 1.12, 1.63, 1.17, 1.55, and 2.12 consistently clustered together in this experiment as well as in other experiments as do antibodies 1.46, 1.31, 2.17, and 1.29. FIG. 6C shows the dendrogram for the ANTIGEN39 antibodies for Experiment 2: Antibodies 1.57 and 1.61 consistently clustered together in this experiment as well as in other experiments.

FIG. 6D shows the dendrogram for the ANTIGEN39 antibodies for Experiment 3: Antibodies 1.55, 1.12, 1.17, 2.12, 1.11, and 1.21 consistently clustered together in this experiment as well as in other experiments. FIG. 6E shows the dendrogram for the ANTIGEN39 antibodies for Experiment 4: Antibodies 1.17, 1.16, 1.55, 1.11, and 1.12 consistently clustered together in this experiment as well as in other experiments as do antibodies 1.31, 1.46, 1.65, and 1.29, as well as antibodies 1.57 and 1.61. FIG. 6F shows the dendrogram for the ANTIGEN39 antibodies for Experiment 5: Antibodies 1.21, 1.12, 2.12, 2.38, 2.35, and 2.1 consistently clustered together in this experiment as well as in other experiments.

In general, the clustering algorithm produced consistent results both among the individual experiments and between the combined and individual data sets. Antibodies which cluster together or are in neighboring clusters for multiple individual data sets also cluster together or are in neighboring clusters for the combined data set. For example, the cells with back hatching indicate antibodies that consistently clustered together in the combined data set and in all of the data sets in which they were present (Experiments 1, 3, 4, and 5). Similarly, the cells with forward hatching indicate the antibodies that consistently clustered together in the combined data set and in Experiments 1, 4, and 5. These results indicate that the algorithm produces consistent clustering results across multiple individual experiments and that it retains the consistency upon the merging of multiple data sets.

Finally, there is a high level of self-consistency for the data with regard to revealing whether or not two antibodies compete for the same epitope. The percent of antibody pairs for which the data consistently reveals whether or not they compete for the same epitope is summarized for each data set in Table 2, above. Table 2 reveals that the consistency was nearly 90% for four out of the five individual data sets as well as for the combined data set.

Example 4

Analysis of a Small Set of IL-8 Human Monoclonal Antibodies Using the Competitive Pattern Recognition Data Analysis Process

A small set of well-characterized human monoclonal antibodies developed against IL-8, a proinflammatory mediator, was used to evaluate the program applying the CPR process. Previously, plate-based ELISAs had shown that antibodies within the set bound two different epitopes: HR26, a215, and D111 recognized one epitope, whereas K221 and a33 competed for a second epitope. Further analysis using epitope mapping studies showed that HR26, a809, and a928 bound to the same or overlapping epitopes, while a837 bound to a different epitope.

In a new experiment to determine whether the CPR process was capable of correctly clustering antibodies, the process was tested on a set of seven IL-8 antibodies, including some of the monoclonal antibodies listed above. The results are summarized in the dendrograms shown in FIG. 7A. The dendrogram on the left was generated by clustering columns, and the dendrogram on the right was generated by clustering rows of the background-normalized signal intensity matrix. Both dendrograms indicated that there were two epitopes for a dissimilarity cut-off of 0.25: one epitope recognized by HR26, a215, a203, a393, and a452, and a second epitope recognized by K221 and a33.

These results using the CPR process to cluster antibodies were consistent with the data from plate-based ELISA assays summarized above. The results obtained using the CPR process indicated that the target antigen appeared to have two distinct epitopes, confirming the results seen using plate-based ELISA assays. Using the CPR process for clustering indicated that HR26 and a215 clustered together, as did K221 and a33, again consistent with the results from plate-based ELISA assays.

The degree of similarity between the two dendrograms provided a measure of the self-consistency of the analyses performed by this process. Ideally, the two dendrograms (the one on the left generated by clustering columns and the one on the right generated by clustering rows) should have been identical for the following reason: if Antibody #1 and Antibody #2 compete for the same epitope, then the intensity should be low when Antibody #1 is the reference antibody and Antibody #2 is the probe antibody, as well as when Antibody #2 is the reference antibody and Antibody #1 is the probe antibody. Likewise, when the two antibodies bind to different epitopes, the intensities should be uniformly high. By this reasoning, the degree of similarity between two rows of the signal intensity matrix should be the same as between the corresponding two columns of the same matrix. In the present example, the dendrograms on the left- and right-hand side of FIG. 7A are nearly identical. In each case, the same antibodies appeared in the two clusters. This high level of self-consistency between row and column clusterings suggested that the experimental protocol, together with the process, produces robust results.

Example 5

Analysis of Multiple Data Sets of IL-8 Antibodies Using the Competitive Pattern Recognition (CPR) Data Analysis Process

Multiple screening experiments using IL-8 antibodies were carried out, generating multiple data sets. Normalized intensity matrices were first generated as described above for the matrices for each individual experiment. Normalized values greater than a user-defined threshold value were set to the user-defined threshold value. High-intensity values were assigned to the threshold value to prevent any single intensity value from having too much weight when the average normalized intensity value was computed for that particular pair of antibodies in a subsequent step. The rows and columns of the average normalized intensity matrix corresponded to the set of “unique” antibodies identified using the methods of the present invention. These “unique” antibodies were identified from among all the antibodies used in all the experiments. The average intensity was computed for each cell in this matrix for which there was at least one intensity value. Cells corresponding to antibody pairs with no data were identified as missing data points. Generation of the dissimilarity matrix was as described above, except that the fraction was determined based on the number of positions at which two rows differed relative to the total number of positions for which both rows had intensity values. If the two rows had no common data, then the dissimilarity value for the corresponding cell was flagged and set arbitrarily high, so the corresponding antibodies would not be grouped together as an artifact.

The clustering results for the set of monoclonal antibodies drawn from five overlapping sets of monoclonal antibodies are summarized in FIG. 7B and Table 3 (below). These dendrograms corroborate the results showing that there are two different epitopes on the target antigen. The first epitope is defined by monoclonal antibodies a809, a928, HR26, a215, and D111, and the second epitope is defined by monoclonal antibodies a837, K221, a33, a142, a358, a203, a393, and a452. The lengths of the branches connecting the clusters indicated that, whereas the first cluster was very different from the other two, the second and third clusters were similar to each other.

To test the capacity of the CPR process to produce consistent results across separate experiments, the five data sets were also independently clustered. The clustering results for the different experiments are summarized in the dendrograms shown in FIGS. 7A, 7B, 7C, and Table 3. These dendrograms demonstrated that the CPR clustering process produced consistent results among the individual experiments and between combined and individual data sets. Each dendrogram had two major branches, indicating two epitopes. Antibodies that clustered together for multiple individual data sets also clustered together or were in neighboring clusters for the combined data set. As shown in Table 3, below, there were only two minor discrepancies in the clustering results across different experiments or between an individual experiment and the combined data set; these discrepancies are indicated by bold type in Table 3. In a data set generated in Experiment 3, D111 clustered with antibodies a33 and K221, instead of HR26 and a215. In a data set generated in Experiment 4, antibodies a203, a393, and a452 appeared in the first cluster, whereas in another experiment (as well as in the combined data set), they appeared in the second cluster. This slight difference is likely attributable to differences in individual antibody affinity between experiments in which the antibody is used as a probe antibody and experiments in which the same antibody is used as a reference antibody. Antibodies with lower affinity may have a reduced capacity to capture antigen out of the solution when used as a reference antibody. However, the overall similarity of the clustering results, as well as the grouping of the antibodies, indicated that the process produced clustering results that agreed well across multiple individual experiments, and that the results remained consistent when multiple data sets were merged.

Finally, there was a high level of consistency in clustering results for each of these data sets when the process was used to cluster by rows and by columns, for the individual and combined data sets. The only discrepancy in the clustering results between row and column clusterings was with D111 in the third data set, in which it clustered with antibodies HR26 and a215 when row clustering was performed, whereas D111 clustered with antibodies a33 and K221 when column clustering was performed.
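By way of illustration only, a discrepancy of this kind can be located by comparing which pairs of antibodies are placed in the same cluster by two clusterings. The sketch below uses the third data set's row and column groupings as stated above; the function names and the pairwise co-membership comparison are hypothetical and are not part of the CPR process as described.

```python
from itertools import combinations
from typing import FrozenSet, List, Set


def co_clustered_pairs(clusters: List[Set[str]]) -> Set[FrozenSet[str]]:
    """All unordered antibody pairs that fall in the same cluster."""
    pairs: Set[FrozenSet[str]] = set()
    for cluster in clusters:
        for a, b in combinations(sorted(cluster), 2):
            pairs.add(frozenset((a, b)))
    return pairs


def discrepant_pairs(first: List[Set[str]], second: List[Set[str]]) -> Set[FrozenSet[str]]:
    """Pairs co-clustered in exactly one of two clusterings (shared antibodies only)."""
    shared = set().union(*first) & set().union(*second)
    first_pairs = co_clustered_pairs([c & shared for c in first])
    second_pairs = co_clustered_pairs([c & shared for c in second])
    return first_pairs ^ second_pairs


# Third data set, as described in the text: row clustering vs. column clustering.
expt3_rows = [{"D111", "HR26", "a215"}, {"a33", "K221"}]
expt3_cols = [{"HR26", "a215"}, {"D111", "a33", "K221"}]

diff = discrepant_pairs(expt3_rows, expt3_cols)
# Antibodies common to every discrepant pair (empty when the clusterings agree).
moved = set.intersection(*(set(pair) for pair in diff)) if diff else set()
```

Comparing pair co-membership rather than cluster numbers avoids any dependence on how the clusters happen to be labelled in each run. Here every discrepant pair involves D111, consistent with the single discrepancy noted above.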

TABLE 3
Results of Clustering for Individual and Combined Data Sets

         Expt1  Expt1  Expt2  Expt2  Expt3  Expt3  Expt4  Expt4  Expt5  Expt5  Comb   Comb
Cluster  Rows   Cols   Rows   Cols   Rows   Cols   Rows   Cols   Rows   Cols   Rows   Cols
1        a809   a809   D111   D111   D111   HR26   HR26   HR26   HR26   HR26   a809   a809
         a928   a928   HR26   HR26   HR26   a215   a215   a215   a215   a215   a928   a928
         HR26   HR26   a215   a215   a215   -      a203   a203   -      -      D111   D111
         -      -      -      -      -      -      a393   a393   -      -      HR26   HR26
         -      -      -      -      -      -      a452   a452   -      -      a215   a215
2        a837   a837   a33    a33    a33    D111   a33    a33    a33    a33    a837   a837
         K221   K221   K221   K221   K221   a33    K221   K221   K221   K221   a33    a33
         -      -      -      -      -      K221   -      -      a203   a203   K221   K221
         -      -      -      -      -      -      -      -      a393   a393   a142   a142
         -      -      -      -      -      -      -      -      a452   a452   a358   a358
         -      -      -      -      -      -      -      -      a142   a142   a203   a203
         -      -      -      -      -      -      -      -      a358   a358   a393   a393
         -      -      -      -      -      -      -      -      -      -      a452   a452

It will be understood by those of skill in the art that numerous and various modifications can be made without departing from the spirit of the present invention. Therefore, it should be clearly understood that the forms of the present invention are illustrative only and are not intended to limit the scope of the present invention. 

What is claimed is:
 1. An antibody competition assay method for determining antibodies that bind to an epitope on an antigen, comprising: providing a set of antibodies that bind to an antigen; labelling each antibody of said set to form a labelled reference antibody set such that each labelled reference antibody is distinguishable from every other labelled reference antibody in said labelled reference antibody set; selecting a probe antibody from said set of antibodies that bind to the antigen; contacting said probe antibody with said labelled reference antibody set in the presence of said antigen; detecting said probe antibody in a complex comprising said antigen, one labelled reference antibody bound to said antigen, and said labelled probe antibody bound to said antigen; identifying each said labelled reference antibody bound to said antigen in each said complex; determining whether said probe antibody competes with any reference antibody in said labelled reference antibody set, wherein competition indicates that said probe antibody binds to the same epitope as another antibody in said set of antibodies that bind to an antigen. 